Features
Crawling is a process that browses any of the Origines specified by the user in a methodical, automated manner.
The workflow for launching the crawler is as follows: an Origine can be selected from the Origine table and launched; the crawler parameters have already been set on the Origine.
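As a minimal sketch of this step, the snippet below shows what launching a crawl from a selected Origine might look like: a breadth-first crawl that starts at the Origine's root URL and stays on the same host. The Origine fields (nom, url, langue, profondeur) and the crawl() helper are hypothetical names used only for illustration; they are not the application's actual API.

```python
# Hypothetical sketch of launching a crawl from a selected Origine.
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


@dataclass
class Origine:
    nom: str          # name of the Origine
    url: str          # root URL the crawl starts from
    langue: str       # language inherited by every crawled URL
    profondeur: int   # maximum crawl depth (assumed parameter)


class LinkExtractor(HTMLParser):
    """Collects href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(origine: Origine) -> list[str]:
    """Breadth-first crawl restricted to the Origine's host."""
    host = urlparse(origine.url).netloc
    seen, frontier = set(), [(origine.url, 0)]
    while frontier:
        url, depth = frontier.pop(0)
        if url in seen or depth > origine.profondeur:
            continue
        seen.add(url)
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # unreachable pages are simply skipped
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if urlparse(absolute).netloc == host:
                frontier.append((absolute, depth + 1))
    return sorted(seen)


if __name__ == "__main__":
    origine = Origine(nom="example", url="https://example.com/",
                      langue="fr", profondeur=1)
    for crawled_url in crawl(origine):
        print(crawled_url)
```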
Once the crawl has completed, the contents are inserted into a Collecte table, as shown below.
The structure of the Collecte list is as follows:
| Collecte list parameter | Description |
| --- | --- |
| Id | Unique ID identifying each collecte |
| Nom | Name of the collecte |
| URLs | The root URL (Origine) from which the collecte was started |
| Debut | When the collecte was started |
| Duree | Total time taken by the crawler to collect the URLs |
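As a hedged illustration of the list above, the snippet below models one Collecte list entry as a small record; the field names mirror the table (Id, Nom, URLs, Debut, Duree), while the types and example values are assumptions made for the sketch.

```python
# Hypothetical representation of one Collecte list entry.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CollecteSummary:
    id: int           # unique ID of the collecte
    nom: str          # name of the collecte
    url: str          # root URL (Origine) the collecte started from
    debut: datetime   # when the collecte was started
    duree: timedelta  # total time the crawler took


summary = CollecteSummary(
    id=1,
    nom="example-collecte",
    url="https://example.com/",
    debut=datetime(2024, 1, 1, 9, 0),
    duree=timedelta(minutes=12),
)
print(f"{summary.nom} started {summary.debut:%Y-%m-%d %H:%M}, ran for {summary.duree}")
```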
If an individual collecte is selected, CRUD operations are available on the Collecte table, as shown below (a minimal sketch follows the table).
The structure of the Collecte table is as follows:
| Collecte table parameter | Description |
| --- | --- |
| Url | Each URL that has been crawled from its Origine |
| Langue | Language of each URL (inherited from its Origine) |
| Type | Type of the crawled URL, such as xml, html, or pdf |
| Facets | Pre-indexation can also be done at the URL level |
| Parser | Set to oui if the URL is to be parsed, non otherwise |
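The following sketch shows, under assumptions, what CRUD operations on the Collecte table could look like with a SQLite backing store. The column names follow the table above (Url, Langue, Type, Facets, Parser), but the schema and storage engine are illustrative, not the application's actual implementation.

```python
# Hypothetical CRUD operations on the Collecte table using SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE collecte (
           url     TEXT PRIMARY KEY,
           langue  TEXT,          -- inherited from the Origine
           type    TEXT,          -- xml, html, pdf, ...
           facets  TEXT,          -- pre-indexation facets at the URL level
           parser  TEXT CHECK (parser IN ('oui', 'non'))
       )"""
)

# Create: register a crawled URL
conn.execute(
    "INSERT INTO collecte VALUES (?, ?, ?, ?, ?)",
    ("https://example.com/page.html", "fr", "html", "produit", "oui"),
)

# Read: fetch a crawled URL by language
row = conn.execute(
    "SELECT url, type, parser FROM collecte WHERE langue = ?", ("fr",)
).fetchone()
print(row)

# Update: exclude a URL from parsing
conn.execute("UPDATE collecte SET parser = 'non' WHERE url = ?",
             ("https://example.com/page.html",))

# Delete: remove the URL from the collecte
conn.execute("DELETE FROM collecte WHERE url = ?",
             ("https://example.com/page.html",))
conn.commit()
conn.close()
```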