Features

DREAM Parser can currently parse the following types of documents:

HTML - Hypertext Markup Language (HTML) is a text-based approach to describing how the content of an HTML file is structured.
PDF - The Parser can also recognise the text content of Portable Document Format (PDF) files and analyses the corresponding facets for it.
XML - Extensible Markup Language (XML) is a metalanguage that allows users to define their own customized markup languages, especially in order to display documents on the Internet. XML parsing also uses an intermediary table known as the Inter Table.
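
As an illustration of what recognising text content in HTML and XML can involve, here is a minimal sketch using only Python's standard library. It is not DREAM Parser's actual implementation: the function names (extract_html_text, extract_xml_text, extract_text) are hypothetical, and PDF is left out because it would need a third-party library.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class _TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_html_text(markup: str) -> str:
    """Hypothetical helper: return the visible text of an HTML document."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()
    return " ".join(collector.chunks)


def extract_xml_text(markup: str) -> str:
    """Hypothetical helper: return all text nodes of an XML document."""
    root = ET.fromstring(markup)
    return " ".join(t.strip() for t in root.itertext() if t.strip())


def extract_text(doc_type: str, markup: str) -> str:
    """Dispatch on the document type (assumed to be 'html' or 'xml')."""
    extractors = {"html": extract_html_text, "xml": extract_xml_text}
    try:
        return extractors[doc_type.lower()](markup)
    except KeyError:
        raise ValueError(f"unsupported document type: {doc_type}") from None
```

The dictionary dispatch keeps each format's extraction separate, so a further format could be added without touching the existing extractors.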

Launching

The Parser is launched as follows:

The user selects the collecte (collection) to be parsed from the collecte list.
Once the collecte has been selected, the user clicks the Lancement button to start the parsing.

After parsing, the data is fed into a table called the Indexation Engine (IE). CRUD operations are provided for this table in case a user needs to manually update any of the URLs that have been parsed.

An example of the IE Table is shown below:

(Pending: waiting for Marius's update of the new IE Table on production.)

The structure of the Indexation Engine table is as follows:

IE Table element    Meaning
url                 The URL of the document that has been parsed
title               The title of the document that has been parsed
Fonction            Facet calculated during parsing; not overwritten if pre-indexed
Secteur             Facet calculated during parsing; not overwritten if pre-indexed
Type d'Info         Facet calculated during parsing; not overwritten if pre-indexed
Theme               Facet calculated during parsing; not overwritten if pre-indexed
Operations          CRUD feature for manually updating the indexation
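
The "not overwritten if pre-indexed" rule for the facets can be sketched as follows. This is only an illustration under stated assumptions: the IERecord class, the lowercase field names and the apply_parsed_facets function are hypothetical, and "pre-indexed" is assumed to mean that the facet already holds a value before parsing.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical field names mirroring the facet columns of the IE Table.
FACET_FIELDS = ("fonction", "secteur", "type_info", "theme")


@dataclass(frozen=True)
class IERecord:
    """Illustrative stand-in for one row of the IE Table."""
    url: str
    title: str
    fonction: Optional[str] = None
    secteur: Optional[str] = None
    type_info: Optional[str] = None
    theme: Optional[str] = None


def apply_parsed_facets(existing: IERecord, parsed: dict) -> IERecord:
    """Fill empty facets from the parsing result.

    Assumption: a facet that already has a value is pre-indexed,
    so the value calculated during parsing is ignored for it.
    """
    updates = {
        name: parsed[name]
        for name in FACET_FIELDS
        if getattr(existing, name) is None and name in parsed
    }
    return replace(existing, **updates)


# Usage: "secteur" is pre-indexed and kept; "theme" is filled from parsing.
record = IERecord(url="https://example.com/a", title="A", secteur="Banque")
merged = apply_parsed_facets(record, {"secteur": "Assurance", "theme": "Risque"})
```

Returning a new record instead of mutating the existing one keeps the pre-indexed row available for comparison.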


Last updated 4 years ago
