BAIA

Extraction of text content from web-sites, especially news publications.

BAIA is a programming language which automates all housekeeping around extraction and saving of text content from web-sites. BAIA is shorthand for Basic Analysis of Internet Articles.

BAIA is ideal for applications where extracting text content from web-sites is vital. Such applications include ad-filtering, news searching, preparing web-content for cell phones, and making it easier to browse the web with text-only web-browsers.
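To give a feel for the kind of housekeeping involved, here is a minimal sketch in Python of pulling paragraph text out of an HTML page while ignoring scripts and style sheets. It is not BAIA syntax and does not use the BAIA interpreter; the class name ParagraphTextExtractor and the function extract_article_text are hypothetical names chosen for this illustration, and the approach (keeping only text inside p elements) is an assumption about what a simple extractor might do.

```python
from html.parser import HTMLParser


class ParagraphTextExtractor(HTMLParser):
    """Hypothetical helper: collects text inside <p> elements only."""

    SKIPPED = {"script", "style"}  # content of these tags is never kept

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None    # list of text chunks while inside <p>, else None
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "p" and self._buffer is None:
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "p" and self._buffer is not None:
            # Join the chunks and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None and self._skip_depth == 0:
            self._buffer.append(data)


def extract_article_text(html):
    """Return the paragraph text of an HTML string, one blank line apart."""
    parser = ParagraphTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)


if __name__ == "__main__":
    page = (
        "<html><body><script>var ad = 1;</script>"
        "<p>First <b>bold</b>  paragraph.</p><p>Second.</p></body></html>"
    )
    print(extract_article_text(page))
```

A real news page needs far more than this (navigation, ads and comment sections often sit inside paragraph tags too), which is the per-site knowledge a BAIA-template is meant to capture.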

News

News about this project can be found on the project summary page. Here is a direct link to the news-page.

As of August 2005, the project is in the process of being open-sourced.

Objectives

The objectives of BAIA are:

  1. Providing the specification and documentation of BAIA as a language
  2. Providing interpreters for BAIA (only Windows is currently supported through a .NET library)
  3. Generating a central repository for BAIA-templates
  4. Demonstrating possible uses for BAIA, especially a news-reader offering news in text-format

How may I participate?

Help is welcome with all four objectives listed above, including from people who are not professional software developers. If you have a basic understanding of HTML, you can participate by writing a BAIA-template for your favourite news publication.

You may also help simply by downloading and using our demonstration applications. The TextNews reader in its "client incarnation" is especially recommended. Feedback from users is very valuable in determining how this project should proceed.

Contact information

The original creator of BAIA is Bjørn Erling Fløtten from Trondheim, Norway.

Comments should be addressed to one of the forums on the project summary page of BAIA. Here is a direct link to the forums page.

If you don't want to register for our forums you can also reach the project administrator via

 

This site is hosted by SourceForge.net.