Extensions & Plug-ins

You will notice that there is an additional directory called "examples" in the admin directory. It contains some example plug-ins.

If you have written a plugin and would like to share it, please let us know. Send us the file together with instructions on how to use it and a description of what it does. If you allow it, we might publish it with a future version of TSEP.

Please look directly into the files for instructions if you do not find any help in the general TSEP documentation.

At this time you will find the following files there:

General

phpcrawl4tsep.php
This is for spidering / crawling a site. A short explanation follows below; for details, please see "Using external data supply".

urllist.php
This hands over a plain list of URLs to TSEP, which then indexes them. A short explanation follows below; for details, please see "Using external data supply".

fillwithcontent.php
This is a ready-to-run example script that delivers file names AND content to TSEP for indexing. The script is intended to give an idea of how content can be delivered to the TSEP indexer. It is the equivalent of the urllist.php example script, except that the file content is provided directly from within the script. A short explanation follows below; for details, please see "Using external data supply".

Short explanation for phpcrawl4tsep.php, urllist.php and fillwithcontent.php

These scripts are completely independent and do not belong to each other!

urllist.php is an example of returning a fixed list of URLs to TSEP. This has nothing to do with PHPCrawl.
If you want TSEP to build an index for a fixed, predefined list of URLs, build this list as is done in urllist.php and use this script as the external datasupply (define "examples/urllist.php" as the external datasupply; the parameter for the datasupply script can be left blank, as it is not used in this case).

If urllist.php contains e.g.

call_user_func("TSEP_ExternalCallBack", "URL>http://www.mydomain.com/school_ov/school_id/1");

this URL is returned to TSEP to be indexed.
You have to switch ON force-parsing-via-HTTP.
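
As a rough illustration of the fixed-list approach described above, a script in the style of urllist.php might look like the following sketch. The helper name send_url_list, the second URL and the stand-in callback are assumptions made for this example only; inside TSEP the real TSEP_ExternalCallBack is provided by the indexer.

```php
<?php
// Stand-in so that this sketch can be run outside TSEP (assumption).
// When the script runs as external datasupply, TSEP defines the real one.
if (!function_exists('TSEP_ExternalCallBack')) {
    function TSEP_ExternalCallBack($message)
    {
        echo $message, "\n";
    }
}

// Hypothetical helper: hands every URL of a fixed list to TSEP
// and returns the number of URLs that were sent.
function send_url_list(array $urls)
{
    $sent = 0;
    foreach ($urls as $url) {
        // "URL>" tells TSEP to fetch and parse the page itself.
        call_user_func('TSEP_ExternalCallBack', 'URL>' . $url);
        $sent++;
    }
    return $sent;
}

// Example URLs, for illustration only.
send_url_list(array(
    'http://www.mydomain.com/school_ov/school_id/1',
    'http://www.mydomain.com/school_ov/school_id/2',
));
```

Keeping the URLs in an array makes it easy to replace the hard-coded list later, for example with rows read from a file or a database.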


phpcrawl4tsep.php runs PHPCrawl and returns all URLs it finds via the same mechanism as urllist.php, e.g.

call_user_func("TSEP_ExternalCallBack", "URL>http://www.mydomain.com/school_ov/school_id/1");

You have to switch ON force-parsing-via-HTTP.

urllist.php and phpcrawl4tsep.php are just examples!

The only thing an "External datasupply" script has to ensure is that the URLs to be indexed are sent to TSEP via

call_user_func("TSEP_ExternalCallBack", "URL>url2bIndexed");

No more, no less!

You can also use

call_user_func("TSEP_ExternalCallBack", "ALL>url2bIndexed<TSEPCONTENT>text");

to directly return "text" as the file's content; in this case, TSEP does not open/read/parse the file (see examples/fillwithcontent.php for this).
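
A minimal sketch of this content-delivering variant is shown below. It is not the code of fillwithcontent.php; the helper name send_documents, the example URLs and texts, and the stand-in callback are assumptions for illustration.

```php
<?php
// Stand-in so that this sketch can be run outside TSEP (assumption).
if (!function_exists('TSEP_ExternalCallBack')) {
    function TSEP_ExternalCallBack($message)
    {
        echo $message, "\n";
    }
}

// Hypothetical helper: sends URL => text pairs to TSEP.
// Because the content is included, TSEP does not open the URL.
function send_documents(array $documents)
{
    $sent = 0;
    foreach ($documents as $url => $text) {
        $message = 'ALL>' . $url . '<TSEPCONTENT>' . $text;
        call_user_func('TSEP_ExternalCallBack', $message);
        $sent++;
    }
    return $sent;
}

// Example data, for illustration only.
send_documents(array(
    'http://www.mydomain.com/page1' => 'Text of the first page.',
    'http://www.mydomain.com/page2' => 'Text of the second page.',
));
```

The text could just as well come from a database or from files on disk; only the format of the message passed to the callback matters.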

Question:

What is the difference between using:

"ALL>

and

"URL>

It wasn't clear to me from reading the documentation. Is it possible to use code like:

"URL>http://www.oururl.com<TSEPCONTENT><html><head><title>Title</title></head><body> Body Text</body></html>"

Or must the "ALL>" command be used? What difference in indexing will it make?

Answer:

URL means: only the URL is sent back to the TSEP indexer (and the TSEP indexer has to open the URL, then read and parse its content).

ALL means that ALL data is sent back to the TSEP indexer, namely

  1. the URL and
  2. the CONTENT.

In this case, the TSEP-indexer takes the URL and CONTENT "as is" (the url does not even need to exist).

In your example, you must use:
"ALL>http://www.oururl.com<TSEPCONTENT><html><head><title>Title</title></head><body>Body Text</body></html>"

For further details please see this discussion in the TSEP forum (external link): Thread 1258790
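
The distinction explained in the answer can be summed up in a small helper. This is only a sketch: the function name tsep_message is hypothetical and not part of TSEP; it merely builds the string that would be passed to TSEP_ExternalCallBack.

```php
<?php
// Hypothetical helper (not part of TSEP): chooses the prefix
// depending on whether content is supplied.
function tsep_message($url, $content = null)
{
    if ($content === null) {
        // URL only: TSEP opens, reads and parses the page itself.
        return 'URL>' . $url;
    }
    // URL plus content: TSEP takes both "as is".
    return 'ALL>' . $url . '<TSEPCONTENT>' . $content;
}

echo tsep_message('http://www.oururl.com'), "\n";
// prints: URL>http://www.oururl.com

echo tsep_message('http://www.oururl.com', 'Body Text'), "\n";
// prints: ALL>http://www.oururl.com<TSEPCONTENT>Body Text
```

In an external datasupply script the returned string would be handed over with call_user_func("TSEP_ExternalCallBack", ...) instead of being printed.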

For *nix systems only

TSEPautoIndexing.sh
Using this file you can run TSEP using "cron". Please see the section "Scheduling: cron / at" for this extensive topic. This is the *nix equivalent of TSEPautoIndexing.cmd (Windows).
wwws.sh
This is an example of how to use wwwshot.sh.
wwwshot.sh
This is intended to build screenshots of web pages. Please see the file itself for documentation on how to use it.
Attention: This has not been tested yet. If you test it, please let us know the results (good or bad, e.g. errors). We will introduce pictures for the websites in a future version of TSEP.

For Windows systems only

TSEPautoIndexing.cmd
This is the Windows equivalent of TSEPautoIndexing.sh. Using this file you can run TSEP using "at". Please see the section "Scheduling: cron / at" for this extensive topic.