"Java" entries.

May 25, 2005

Web proxy systems, mainly in Java

After my previous entry, Structured graphics, diagramming, graphs and networks in Java, published one month ago, here is a new entry about web proxies. My starting point was the article Open Source Personal Proxy Servers Written In Java on Manageability.

Generic proxy systems

Transcoding architectures

  • IBM Research Web Intermediaries (WBI):

    Aiming to produce a more powerful and flexible web, we have developed the concept of intermediaries. Intermediaries are computational entities that can be positioned anywhere along an information stream and are programmed to tailor, customize, personalize, or otherwise enhance data as they flow along the stream.
    A caching web proxy is a simple example of an HTTP intermediary. Intermediary-based programming is particularly useful for adding functionality to a system when the data producer (e.g., server or database) or the data consumer (e.g., browser) cannot be modified.
    Web Intermediaries (WBI, pronounced "webby") is an architecture and framework for creating intermediary applications on the web. WBI is a programmable web proxy and web server. We are now making available the WBI Development Kit for building web intermediary applications within the WBI framework, using Java APIs. Many types of applications can be built with WBI; you can also download some plugins.
    One key intermediary application is the transformation of information from one form to another, a process called transcoding. In fact, the WBI Development Kit now provides the same plugin APIs as IBM WebSphere Transcoding Publisher. Applications developed with WBI version 4.5 can be used with the Transcoding Publisher product (with a few exceptions), as WBI constitutes the backbone on which transcoding operations run.
    Other examples of intermediary applications include:
    * personalizing the web
    * password & privacy management
    * awareness and interactivity with other web users
    * injecting knowledge from "advisors" into a user's web browsing
    * filtering the web for kids
    WBI has an interesting and entertaining history.
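
    To make the idea of an intermediary concrete, here is a minimal sketch of the caching example mentioned above. It does not use the WBI APIs: the class name CachingIntermediary, the producerCalls counter and the use of a plain function as the "data producer" are my own assumptions, chosen only to show an entity placed on the stream between consumer and producer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch (not the WBI API): an intermediary placed between consumer and producer. */
final class CachingIntermediary implements Function<String, String> {
    private final Function<String, String> producer; // stands in for the origin server
    private final Map<String, String> cache = new HashMap<>();
    private int producerCalls = 0;

    CachingIntermediary(Function<String, String> producer) {
        this.producer = producer;
    }

    @Override
    public String apply(String url) {
        String cached = cache.get(url);
        if (cached != null) {
            return cached; // served by the intermediary, the producer is not contacted
        }
        producerCalls++;
        String body = producer.apply(url);
        cache.put(url, body);
        return body;
    }

    /** Number of requests that actually reached the producer. */
    int producerCalls() {
        return producerCalls;
    }

    public static void main(String[] args) {
        CachingIntermediary proxy =
                new CachingIntermediary(url -> "<html>content of " + url + "</html>");
        System.out.println(proxy.apply("http://example.org/"));
        System.out.println(proxy.apply("http://example.org/"));
        System.out.println("producer calls: " + proxy.producerCalls());
    }
}
```

    Neither the producer nor the consumer has to change: the consumer simply calls the intermediary instead of the producer, which is the situation the WBI description has in mind.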

    [quite old: the tech page of the WBI Development Kit was last updated on “March 25,2004”, but the downloadable files are older (June 2000) – alphaWorks License]

    There are two related webpages on IBM alphaWorks: the WBI Development Kit for Java webpage and the Transcoding Proxy webpage.

    Many interesting papers can be found in the Publications section, coming from the Almaden Research Center of IBM Research.

    WebFountain is another project from IBM Research (UIMA: The Unstructured Information Management Architecture Project) dealing with the problem of searching data that is not fully structured: WebFountain is a set of research technologies that collect, store and analyze massive amounts of unstructured and semi-structured text. It is built on an open, extensible platform that enables the discovery of trends, patterns and relationships from data. Again, papers can be found in the Publications section. IBM alphaWorks has a Semantics research topic area.

  • AT&T Mobile Network:

    AT&T Mobile Network (AMN - formerly known as iMobile) is a project that addresses the research issues in building mobile service platforms. AMN currently consists of three editions: Standard Edition (SE), Enterprise Edition (EE), and Micro Edition (ME).
    AMN SE was built by extending iProxy, a programmable proxy. The proxy maintains user and device profiles, accesses and processes internet resources on behalf of the user, keeps track of user interactions, and performs content transformations according to the device and user profiles. The user accesses internet services through a variety of wireless devices and protocols (cell phones with SMS, WAP phones, PDA's, AOL Instant Messenger, Telnet, Email, Http, etc.)
    The main research issues in AMN include
    * Authentication: How do the proxy and associated services authenticate the user?
    * Profile Management: How does the proxy maintain the user and device profiles? How do the profiles affect the services?
    * Transcoding Service: How does the proxy map various formats (HTML, XML, WML, Text, GIF, JPEG, etc.) from one form to the other?
    * Personalized Services: How can new services be created by taking advantage of the user access logs and location/mobility information?
    * Deployment: How should the proxy be deployed? On the server side, on the network, on the client side, or should we use a mixed approach?

    [no download]

    Developed by AT&T Labs – Research.

  • The transcoders project on java.net:

    [not active – Apache Software License]

    There are somewhat interesting references in the description of the project.

  • Pluxy:

    Pluxy is a modular Web proxy which can receive a dynamic set of services. Pluxy provides the infrastructure to download services, to execute them and to make them collaborate. Pluxy comes with a set of basic services like collaborative HTTP request processing, GUI management and distributed services. Three Pluxy applications are introduced: a collaborative filtering service for the Web, an extended caching system and a tool to know about document changes.

    [dead: last modifications of the webpages were made from March to April, 1999]

    Two papers written by Olivier Dedieu from INRIA's SOR Action are available, with the same title: Pluxy : un proxy Web dynamiquement extensible (INRIA Research Report RR-3417, May 1998) and Pluxy : un proxy Web dynamiquement extensible (1998 NoTeRe colloquium, 20-23 October 1998).

  • eRACE Project:

    The extensible Retrieval Annotation Caching Engine (eRACE) is a middleware system designed to support the development and provision of intermediary services on Internet. eRACE is a modular, programmable and distributed proxy infrastructure that collects information from heterogeneous Internet sources and protocols according to end-user requests and eRACE profiles registered within the infrastructure.
    Collected information is stored in a software cache for further processing, personalized dissemination to subscribed users, and wide-area dissemination on the wireline or wireless Internet. eRACE supports personalization by enabling the registration, maintenance and management of personal profiles that represent the interests of individual users. Furthermore, the structure of eRACE allows the customization of its service provision according to information-access modes (pull or push), client-proxy communication (wireline or wireless; email, HTTP, WAP), and client-device capabilities (PC, PDA, mobile phone, thin clients). Finally, eRACE supports the ubiquitous provision of services by decoupling information retrieval, storage and filtering from content publishing and distribution.
    eRACE can easily incorporate mechanisms for providing subscribed users with differentiated service-levels at the middleware level. This is achieved by the translation of user requests and eRACE profiles into "eRACE requests" tagged with QoS information. These requests are scheduled for execution by an eRACE scheduler, which can make scheduling decisions based on the QoS tags.
    Performance scalability is an important consideration for eRACE given the expanding numbers of WWW users, the huge increase of information sources available on the Web, and the need to provide robust service. To this end, the performance-critical components of eRACE are designed to support multithreading and distributed operation, so that they can be easily deployed on a cluster of workstations.
    The eRACE system consists of protocol-specific proxies, like WebRACE, mailRACE, newsRACE and dbRACE, that gather information from the World-Wide Web, POP3 email-accounts, USENET NNTP-news, and Web-database queries, respectively. At the core of eRACE lies a user-driven, high-performance and distributed crawler, filtering processor and object cache, written entirely in Java. Moreover, the system employs Java-based mobile agents to enhance the distribution of loads and its adaptability to changing network-traffic conditions.
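
    The scheduling of QoS-tagged requests described above can be pictured with a priority queue. This is only a sketch of the general technique, not the eRACE scheduler: the names QosScheduler and TaggedRequest, the convention that a lower QoS level is served first, and the FIFO tie-breaking are all my own assumptions.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical sketch (not eRACE code): requests are served in order of their QoS tag. */
final class QosScheduler {
    /** A request tagged with a QoS level; assumption: a lower level is served first. */
    static final class TaggedRequest {
        final String target;
        final int qosLevel;
        final long sequence; // arrival order, used to keep FIFO order within a level

        TaggedRequest(String target, int qosLevel, long sequence) {
            this.target = target;
            this.qosLevel = qosLevel;
            this.sequence = sequence;
        }
    }

    private final PriorityQueue<TaggedRequest> queue = new PriorityQueue<>(
            Comparator.comparingInt((TaggedRequest r) -> r.qosLevel)
                      .thenComparingLong(r -> r.sequence));
    private long nextSequence = 0;

    /** Queues a request for the given target with the given QoS level. */
    void submit(String target, int qosLevel) {
        queue.add(new TaggedRequest(target, qosLevel, nextSequence++));
    }

    /** Returns the next target to process, or null when nothing is queued. */
    String next() {
        TaggedRequest r = queue.poll();
        return r == null ? null : r.target;
    }

    public static void main(String[] args) {
        QosScheduler scheduler = new QosScheduler();
        scheduler.submit("http://example.org/bulk", 2);
        scheduler.submit("http://example.org/premium", 0);
        for (String t = scheduler.next(); t != null; t = scheduler.next()) {
            System.out.println(t);
        }
    }
}
```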

    [old: no paper since 2002]

    The Papers section contains many publications, from Marios Dikaiakos (Department of Computer Science, University of Cyprus) and Demetris Zeinalipour (now Department of Computer Science & Engineering, University of California, Riverside) among others. For example: Intermediary Infrastructures for the WWW.

  • Platform for Information Applications (PIA):

    The Platform for Information Applications (PIA) is an open source framework for rapidly developing flexible, dynamic, and easy to maintain information browser-based applications. Such applications are created without programming and can be maintained by users and office administrators.
    This framework has been used to build a broad range of applications, including a "workflow" web server that handles all of the purchase authorizations, time cards, and other (ex-)paperwork at Ricoh Innovations, Inc.
    The PIA does this by separating an application into a core processing engine (a shared software engine, akin to a web server) and a task-specific collection of "active" XML pages, which specify not only the content but also the behavior of the application (XML is the W3C standard, for eXtensible Markup Language). So one document, by itself, can include other documents (or pieces of them), iterate over lists, make decisions, calculate, search/substitute text, and in general do almost anything a traditional "CGI script" or document processing program would do.
    Application developers can extend the basic set of HTML and PIA elements ("tags") by defining new ones in terms of the existing ones. As a result, a PIA application can be customized simply by editing a single, familiar-looking XML page... in contrast to conventional Web applications, where even a simple change (like adding an input field) might require finding and fixing Perl CGI scripts or recompiling Java classes in several directories.

    [very old: “2.1.6 built Tue Apr 3 11:57:21 PDT 2001”]

  • The concept of “transcoding services” is implemented on the server side in Apache Cocoon:

    Apache Cocoon is a web development framework built around the concepts of separation of concerns and component-based web development.
    Cocoon implements these concepts around the notion of 'component pipelines', each component on the pipeline specializing on a particular operation. This makes it possible to use a Lego(tm)-like approach in building web solutions, hooking together components into pipelines without any required programming.
    Cocoon is "web glue for your web application development needs". It is a glue that keeps concerns separate and allows parallel evolution of all aspects of a web application, improving development pace and reducing the chance of conflicts.
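
    The "component pipeline" notion can be illustrated in a few lines of plain Java. This sketch has nothing to do with Cocoon's real sitemap or component interfaces: ComponentPipeline and its run method are hypothetical names, and each component is reduced to a function from text to text.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch (not Cocoon's API): each component does one job, the pipeline chains them. */
final class ComponentPipeline {
    /** Feeds the input through every component in order and returns the final result. */
    static String run(String input, List<UnaryOperator<String>> components) {
        String current = input;
        for (UnaryOperator<String> component : components) {
            current = component.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> stages = List.of(
                String::trim,                    // normalize the raw content
                s -> s.replace("&", "&amp;"),    // escape it for HTML
                s -> "<p>" + s + "</p>");        // serialize it as a paragraph
        System.out.println(run("  fish & chips  ", stages));
    }
}
```

    Reordering or swapping a stage changes the output without touching the other stages, which is the separation of concerns the description insists on.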

    [active – Apache Software License]

Going back to 1995, here is the paper that laid much of the groundwork for the concept: Application-Specific Proxy Servers as HTTP Stream Transducers.

This kind of service is being standardized by the Open Pluggable Edge Services (OPES) Working Group of the IETF (the Internet Engineering Task Force):

The Internet facilitates the development of networked services at the application level that both offload origin servers and improve the user experience. Web proxies, for example, are commonly deployed to provide services such as Web caching, virus scanning, and request filtering. Lack of standardized mechanisms to trace and to control such intermediaries causes problems with respect to failure detection, data integrity, privacy, and security.
The OPES Working Group has previously developed an architectural framework to authorize, invoke, and trace such application-level services for HTTP. The framework follows a one-party consent model, which requires that each service be authorized explicitly by at least one of the application-layer endpoints. It further requires that OPES services are reversible by mutual agreement of the application endpoints.

Web filtering

  • Muffin – World Wide Web Filtering System:

    * Written entirely in Java. Requires JDK 1.1. Runs on Unix, Windows 95/NT, and Macintosh.
    * Freely available under the GNU General Public License.
    * Support for HTTP/0.9, HTTP/1.0, HTTP/1.1, and SSL (https).
    * Graphical user interface and command-line interface.
    * Remote admin interface using HTML forms.
    * Includes several filters which can remove cookies, kill GIF animations, remove advertisements, add/remove/modify arbitrary HTML tags (like blink), remove Java applets and Javascript, user-agent spoofing, rewrite URLs, and much more.
    * View all HTTP headers to aid in CGI development and debugging.
    * Users can write their own filters in Java using the provided filter interfaces.
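
    To give an idea of what a user-written filter of this kind looks like, here is a minimal cookie-removing filter. I have not reproduced Muffin's actual filter interfaces: HeaderFilter and CookieStrippingFilter are hypothetical names, and headers are simplified to a name/value map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical filter contract (not Muffin's interface): takes headers, returns filtered headers. */
interface HeaderFilter {
    Map<String, String> filter(Map<String, String> headers);
}

/** Drops Cookie and Set-Cookie headers, leaving every other header untouched. */
final class CookieStrippingFilter implements HeaderFilter {
    @Override
    public Map<String, String> filter(Map<String, String> headers) {
        Map<String, String> kept = new LinkedHashMap<>(); // preserves header order
        for (Map.Entry<String, String> header : headers.entrySet()) {
            String name = header.getKey();
            // HTTP header names are case-insensitive
            boolean isCookie = name.equalsIgnoreCase("Cookie") || name.equalsIgnoreCase("Set-Cookie");
            if (!isCookie) {
                kept.put(name, header.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "text/html");
        headers.put("Set-Cookie", "session=abc123");
        System.out.println(new CookieStrippingFilter().filter(headers));
    }
}
```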

    [very old: last file release on SourceForge.net on “April 4, 2000” – GPL]

    Old but simple. The thesis slides describing Muffin can be read on the website.

  • PAW (Pro-Active Webfilter)
    PAW (pro-active webfilter) is an Open-Source filtering HTTP proxy based on the Brazil Framework provided as an Open-Source Project by SUN. Because the Brazil Framework and PAW are written in Java the software is highly portable.
    PAW allows for easy plugin of Handlers (filter outgoing requests) and Filters (filter incoming data - the HTML response) and a GUI for end users. All the configuration files are in XML-Format and thus easy to modify (even without the GUI).
    Its aim is to provide an easy to use interface for end users and to be easily extendable by developers. PAW consists of the following components:
    * PAW Server which implements the filtering HTTP Proxy.
    * PAW GUI for easy PAW Server administration.

    [old: last file release on SourceForge.net on “January 17, 2003” – Apache Software License – uses the Sun Brazil Web Application framework]

  • Privoxy
    Privoxy is a web proxy with advanced filtering capabilities for protecting privacy, modifying web page content, managing cookies, controlling access, and removing ads, banners, pop-ups and other obnoxious Internet junk. Privoxy has a very flexible configuration and can be customized to suit individual needs and tastes. Privoxy has application for both stand-alone systems and multi-user networks.
    Privoxy is based on Internet Junkbuster.

    [still active but last file release on SourceForge.net on “January 30, 2004” – GPL – coded in C]

  • The Proxomitron (also there, more information on Proxomitron.info)
    For those who have not yet been introduced, meet the Proxomitron: a free, highly flexible, user-configurable, small but very powerful, local HTTP web-filtering proxy.

    [old and dead: “There were two separate releases of Proxomitron 4.5, one in May of 2003 and one in June.” – for Windows]

    Jon Udell wrote an article entitled SSL Proxying – Opening a window onto secure client/server conversations, inspired by the SSL support in Proxomitron. In it, he shows how to code a very simple web proxy in Perl (libwww-perl is a powerful library for developing web applications in Perl).

  • Amit's Web Proxy Project [coded in Python]: Proxy 2 [dead: “1997”], Proxy 3 [dead: “1998”], Proxy 4 [dead: “2000”], and Proxy 5 [dead: “[2005-04-12] A lot of the HTML-modifying tricks I wanted to implement are easier to implement in GreaseMonkey, so I haven't had much motivation to work on a proxy to do these things. See a list of GreaseMonkey plugins.”].

    Amit J. Patel worked on this subject while doing his thesis (more here). On his webpage he links to A list of open-source HTTP proxies written in python, a very complete list of web proxies in Python.

  • FilterProxy:

    FilterProxy is a generic http proxy with the capability to modify proxied content on the fly. It has a modular system of filters which can modify web pages. The modular system means that many filters can be applied in succession to a web page, and configuration is easy and flexible. FilterProxy can proxy any data served by the HTTP protocol (i.e. anything off the web), and filter any recognizable mime-type. All configuration is done via web-based forms, or editing a configuration file. It was created to fix some of the annoyances of poor web design by rewriting it. It also can improve the web for you, in both speed (Compress) and quality (Rewrite/XSLT). After ads (and their graphics) are stripped out, and html is compressed, surfing over a modem is much faster. Compare to Muffin (a similar project in java), and WebCleaner (a similar project in python) in purpose and functionality. FilterProxy is written in perl, and is quite fast.

    [old: last file release on SourceForge.net on “January 12, 2002” – GPL – coded in Perl]

  • The V6 Web Engine:

    V6 is to the Web what pipes are in Unix systems: a compositional device to combine document processing. To be easily integrated in the Web architecture, V6 is available as a personal proxy. Relying on a common skeleton architecture and Web related libraries, V6 can be easily configured to support various sets of filters while remaining portable and browser independent. The filters may act on the requests emitted by the browser (or other web client) or on the document returned by a server, or both.
    In the current release, the available filters include
    * flexible caching
    * request redirection
    * HTML filtering (based on NoShit)
    * global history
    * on-the-fly full text indexing
    V6 can be used to support many other navigation aids and Web-related tools in a uniform, browser independent way. In addition, V6 can also be used as a traditional http server: this is particularly useful to serve private files without needing access to the site-wide http server, or to interface to local, private applications (mail, ...) through the CGI interface.

    [archeology: last paper from 1996 and “V6 was written mostly in 1995/1996. Development and maintenance stopped in 1997” (there) – Copyright INRIA – coded in Objective Caml]

    Useful for very old references: the position paper for example.

  • A new way of filtering web content is through greasemonkey:

    Greasemonkey is a Firefox extension which lets you add bits of DHTML ("user scripts") to any web page to change its behavior. In much the same way that user CSS lets you take control of a web page's style, user scripts let you easily control any aspect of a web page's design or interaction.

    Downside: Firefox needed.

Web 1.0 experience augmentation

  • The MeStream Proxy:

    ThemeStream is an online "personal interest" site. It works on a self-publishing model; authors may post articles freely in a wide variety of categories. Unfortunately, its reader-based rating system is not particularly reliable, nor is it customizable. The MeStream Proxy allows users to customize how they view ThemeStream and rate ThemeStream content.

    [very old: last file release on SourceForge.net on “July 31, 2000” – GPL – MeStream was developed using the WBI development kit]

    Note: ThemeStream is dead.

Identity management (à la RoboForm)

  • Super Proxy System (SPS):

    Super Proxy System is the combination of a proxyserver and a mailserver.
    In addition to relaying the request and response between the user client and remote server, proxyserver also provides some special functions. For example, it helps fill in the form appearing on the webpage. This will release the user from inputting the data every time when browsing some websites such as New York Times (www.nytimes.com). And all kinds of filters can be included if the user wants so that such annoyances as cookies, pop-up windows and javascript can be removed, which will protect your privacy when you surf the internet.
    A special mailserver is built together with proxyserver, which is necessary in some cases where a confirmation email should be replied when registering the account in a form.
    Super Proxy System makes your web surfing easy and secure.
    Super Proxy System can be run in a local area network or individually.

    [quite old: “Last Updated Jan. 25, 2003”]

    Two members of this project (from New York University) have interesting lists of publications: David Mazières (papers), and Helen Nissenbaum (papers).

Web accelerators

  • Squid Web Proxy Cache:

    Squid is...
    * a full-featured Web proxy cache
    * designed to run on Unix systems
    * free, open-source software
    * the result of many contributions by unpaid (and paid) volunteers
    Squid supports...
    * proxying and caching of HTTP, FTP, and other URLs
    * proxying for SSL
    * cache hierarchies
    * ICP, HTCP, CARP, Cache Digests
    * transparent caching
    * WCCP (Squid v2.3 and above)
    * extensive access controls
    * HTTP server acceleration
    * SNMP
    * caching of DNS lookups

    [active ;-) – GPL – coded in C]

    The reference in the UNIX world (not written in Java). Interesting Related Software webpage on the Squid website.

  • RabbIT proxy for a faster web:

    RabbIT is a web proxy that speeds up web surfing over slow links by doing:
    * Compress text pages to gzip streams. This reduces size by up to 75%
    * Compress images to 10% jpeg. This reduces size by up to 95%
    * Remove advertising
    * Remove background images
    * Cache filtered pages and images
    * Uses keepalive if possible
    * Easy and powerful configuration
    * Multi threaded solution written in java
    * Modular and easily extended
    * Complete HTTP/1.1 compliance
    RabbIT is a proxy for HTTP, it is HTTP/1.1 compliant (testing being done with Co-Advisors test, http://coad.measurement-factory.com/) and should hopefully support the latest HTTP/x.x in the future. Its main goal is to speed up surfing over slow links by removing unnecessary parts (like background images) while still showing the page mostly like it is. For example, we try not to ruin the page layout completely when we remove unwanted advertising banners. The page may sometimes even look better after filtering as you get rid of pointless animated gif images.
    Since filtering the pages is a "heavy" process, RabbIT caches the pages it filters but still tries to respect cache control headers and the old style "pragma: no-cache". RabbIT also accepts request for nonfiltered pages by prepending "noproxy" to the address (like http://noproxy.www.altavista.com/). Optionally, a link to the unfiltered page can be inserted at the top of each page automatically.
    RabbIT is developed and tested under Solaris and Linux. Since the whole package is written in java, the basic proxy should run on any platform that supports java. Image processing is done by an external program and the recommended program is convert (found in ImageMagick). RabbIT can of course be run without image processing enabled, but then you lose a lot of the time savings it gives.
    RabbIT works best if it is run on a computer with a fast link (typically your ISP). Since every large image is compressed before it is sent from the ISP to you, surfing becomes much faster at the price of some decrease in image quality. If some parts of the page are already cached by the proxy, the speedup will often be quite amazing. For 1275 random images only 22% (2974108 bytes out of a total of 13402112) were sent to the client. That is 17 minutes instead of 75 using 28.8 modem.
    RabbIT works by modifying the pages you visit so that your browser never sees the advertising images, it only sees one fixed image tag (that image is cached in the browser the first time it is downloaded, so subsequent requests for it are served from the browser's cache, giving a nice speedup). For images, RabbIT fetches the image and runs it through a processor, giving a low quality jpeg instead of the animated gif-image. This image is very much smaller and downloading it should be quick even over a slow link (modem).
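
    The first technique in the list, compressing text pages to gzip streams, is easy to try with the standard java.util.zip classes. The sketch below is not taken from RabbIT's sources: GzipPageCompressor is a hypothetical name, and the actual saving depends entirely on the page (the "up to 75%" figure is RabbIT's claim, not something this code demonstrates).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch (not RabbIT code): gzip a text page before sending it over a slow link. */
final class GzipPageCompressor {
    /** Returns the gzip-compressed UTF-8 bytes of the page. */
    static byte[] compress(String page) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(page.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // complete only once the gzip stream has been closed
    }

    /** Reverses compress(), as the browser would do on reception. */
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder page = new StringBuilder("<html><body>");
        for (int i = 0; i < 200; i++) {
            page.append("<p>Repetitive markup compresses well.</p>");
        }
        page.append("</body></html>");
        int raw = page.toString().getBytes(StandardCharsets.UTF_8).length;
        int packed = compress(page.toString()).length;
        System.out.println(raw + " bytes -> " + packed + " bytes");
    }
}
```

    In a real proxy the compressed body would be sent with a Content-Encoding: gzip header, and only to clients that announced gzip support in Accept-Encoding.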

    [active: last file release on SourceForge.net on “January 11, 2005” – BSD License]

  • WWWOFFLE (World Wide Web Offline Explorer)
    The wwwoffled program is a simple proxy server with special features for use with dial-up internet links. This means that it is possible to browse web pages and read them without having to remain connected.

    [old: “Version 2.8 of WWWOFFLE released on Mon Oct 6 2003” – GPL – coded in C]

HTTP debugger and HTTP/HTML awareness

  • WebScarab (from OWASP – The Open Web Application Security Project):

    WebScarab is a framework for analysing applications that communicate using the HTTP and HTTPS protocols. It is written in Java, and is thus portable to many platforms. In its simplest form, WebScarab records the conversations (requests and responses) that it observes, and allows the operator to review them in various ways.
    WebScarab is designed to be a tool for anyone who needs to expose the workings of an HTTP(S) based application, whether to allow the developer to debug otherwise difficult problems, or to allow a security specialist to identify vulnerabilities in the way that the application has been designed or implemented.
    A framework without any functions is worthless, of course, and so WebScarab provides a number of plugins, mainly aimed at the security functionality for the moment. Those plugins include:
    * Fragments - extracts Scripts and HTML comments from HTML pages as they are seen via the proxy, or other plugins
    * Proxy - observes traffic between the browser and the web server. The WebScarab proxy is able to observe both HTTP and encrypted HTTPS traffic, by negotiating an SSL connection between WebScarab and the browser instead of simply connecting the browser to the server and allowing an encrypted stream to pass through it. Various proxy plugins have also been developed to allow the operator to control the requests and responses that pass through the proxy.
    o Manual intercept - allows the user to modify HTTP and HTTPS requests and responses on the fly, before they reach the server or browser.
    o Beanshell - allows for the execution of arbitrarily complex operations on requests and responses. Anything that can be expressed in Java can be executed.
    o Reveal hidden fields - sometimes it is easier to modify a hidden field in the page itself, rather than intercepting the request after it has been sent. This plugin simply changes all hidden fields found in HTML pages to text fields, making them visible, and editable.
    o Bandwidth simulator - allows the user to emulate a slower network, in order to observe how their website would perform when accessed over, say, a modem.
    * Spider - identifies new URLs on the target site, and fetches them on command.
    * Manual request - Allows editing and replay of previous requests, or creation of entirely new requests.
    * SessionID analysis - collects and analyses a number of cookies (and eventually URL-based parameters too) to visually determine the degree of randomness and unpredictability.
    * Scripted - operators can use BeanShell to write a script to create requests and fetch them from the server. The script can then perform some analysis on the responses, with all the power of the WebScarab Request and Response object model to simplify things.
    Future development will probably include:
    * Parameter fuzzer - performs automated substitution of parameter values that are likely to expose incomplete parameter validation, leading to vulnerabilities like Cross Site Scripting (XSS) and SQL Injection.
    * WAS-XML Static Tests - leveraging the OASIS WAS-XML format to provide a mechanism for checking known vulnerabilities.
    As a framework, WebScarab is extensible. Each feature above is implemented as a plugin, and can be removed or replaced. New features can be easily implemented as well. The sky is the limit! If you have a great idea for a plugin, please let us know about it on the list.
    There is no shiny red button on WebScarab, it is a tool primarily designed to be used by people who can write code themselves, or at least have a pretty good understanding of the HTTP protocol. If that sounds like you, welcome! Download WebScarab, sign up on the subscription page, and enjoy!
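
    The "Reveal hidden fields" plugin described above boils down to rewriting the HTML response on its way to the browser. Here is a rough sketch of that rewrite using a regular expression. It is not WebScarab's implementation: HiddenFieldRevealer is a hypothetical name, and a regex is only an approximation (it would also touch attributes such as data-type), so a real tool would rather use an HTML parser.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch (not WebScarab code): turn hidden form fields into visible text fields. */
final class HiddenFieldRevealer {
    // Group 1: the <input ...> tag up to and including "type=".
    // Group 2: the optional opening quote, which must be repeated after "hidden".
    private static final Pattern HIDDEN_TYPE = Pattern.compile(
            "(<input\\b[^>]*?\\btype\\s*=\\s*)([\"']?)hidden\\2",
            Pattern.CASE_INSENSITIVE);

    /** Returns the HTML with every type=hidden on an input tag replaced by type=text. */
    static String reveal(String html) {
        return HIDDEN_TYPE.matcher(html).replaceAll("$1$2text$2");
    }

    public static void main(String[] args) {
        String form = "<form><input type=\"hidden\" name=\"price\" value=\"100\">"
                + "<input type=\"submit\"></form>";
        System.out.println(reveal(form));
    }
}
```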

    [active: last release on “20050222”]

  • Charles Web Debugging Proxy:

    Charles is an HTTP proxy / HTTP monitor / Reverse Proxy that enables a developer to view all of the HTTP traffic between their machine and the Internet. This includes requests, responses and the HTTP headers (which contain the cookies and caching information).
    Charles can act as a man-in-the-middle for HTTP/SSL communication, enabling you to debug the content of your HTTPS sessions.
    Charles simulates modem speeds by effectively throttling your bandwidth and introducing latency, so that you can experience an entire website as a modem user might (bandwidth simulator).
    Charles is especially useful for Macromedia Flash developers as you can view the contents of LoadVariables, LoadMovie and XML loads.
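
    The modem simulation mentioned above (throttled bandwidth plus added latency) rests on simple arithmetic: how long should a response of a given size take on the simulated link? The sketch below only computes that delay and says nothing about how Charles does it internally. BandwidthSimulator, transferMillis and the example figures are my own assumptions.

```java
/** Hypothetical sketch (not Charles code): compute the delay a simulated slow link would impose. */
final class BandwidthSimulator {
    /**
     * Milliseconds needed to deliver the given number of bytes over a link of the given
     * bandwidth, after a fixed latency. The transmission time is rounded up.
     */
    static long transferMillis(long bytes, long bitsPerSecond, long latencyMillis) {
        if (bitsPerSecond <= 0) {
            throw new IllegalArgumentException("bitsPerSecond must be positive");
        }
        long bits = bytes * 8;
        long transmitMillis = (bits * 1000 + bitsPerSecond - 1) / bitsPerSecond; // ceiling division
        return latencyMillis + transmitMillis;
    }

    public static void main(String[] args) {
        // Assumed example: a 50,000-byte page over a 56,000 bit/s link with 150 ms latency.
        System.out.println(transferMillis(50_000, 56_000, 150) + " ms");
    }
}
```

    A proxy would then hold back each chunk of the response (for example with Thread.sleep) until the computed time has elapsed.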

    [seems still active: last update on freshmeat.net on “25-Dec-2004”]

  • Surfboard:

    Surfboard is a filtering HTTP 1.1 proxy. It features dynamic filter management through an interactive HTML console, IP tunneling, WindowMaker applets, and a suite of filters. See the Features page for details.
    Who should use this? Surfboard is a "personal proxy", intended to be used by individuals rather than organizations. Its purpose is not to censor or monitor surfing activity, nor is it intended to implement caching within the proxy. Filters could be written to do these things, but it's not something I'm personally interested in doing, and it's already available in other proxies. My goal with surfboard is to make a proxy that covers new ground and lets you "surf in style" by adding visual feedback, interaction, and network load balancing to make websurfing more enjoyable.
    Why another filtering proxy? A long time ago, I wanted a simple way to examine HTTP headers for a project I was working on. All the existing proxies I found were overkill for what I wanted, and were nontrivial to configure. So instead, I spent a lunch break writing a very simple proxy in Java that did everything I needed. Later I modified it to remove certain types of banner ads, but I was unhappy with the code -- it was ugly and difficult to maintain. I imagined that someday I would re-write it and "do it right", making everything dynamic with a browser-enabled console, some WindowMaker applets to visualize HTTP activity and to toggle filters on/off on the fly, etc. The typical second-system effect, in other words :-)

    [old: last file release on SourceForge.net on “January 12, 2002” – GPL – mainly in Java, but frontend parts coded in C]

  • the Axis TCP Monitor (tcpmon):

    A lightweight Java TCP proxy (from the Axis project).

Personal assistant

  • Webmate:

    WebMate is part of the Intelligent Software Agents project headed by Katia Sycara. It is a personal agent for World-Wide Web browsing and searching developed by Liren Chen. It accompanies you when you travel on the internet and provides you what you want.
    Features
    * Searching enhancement, including parallel search (it can send search request to the current popular search engines and get results from them, reorder them according to how much overlapping among the different search engines), searching keywords refinement using our relevant keywords extraction technology, relevant feedback, etc.
    * Browsing assistant, including learning your current interesting, recommending you new URLs according to your profile and selected resources, giving some URL a short name or alias, monitoring bookmarks of Netscape or IE, getting more like the current browsing page, sending the current browsing page to your friends, prefetching the following hyperlinks at the current browsing page, etc.
    * Offline browsing, including downloading the following pages from the current page for offline browsing, getting the references of some pages and printing it out.
    * Filtering HTTP header, including recording http header and all the transactions between your browser and WWW servers, filtering the cookie to protect your privacy, block the animation gif file to speed up your browsing, etc.
    * Checking the HTML page to find the errors in it, checking embedded links in to find the dead links for your learning to write HTML pages or maintain your webmate site, etc.
    * Dynamically setting up all kinds of resources, including search engines, dictionaries available in the WWW, online translation systems available in the WWW, etc.
    * Programming in Java, independent of operating system, running in multi-thread.

    [dead: downloadable file from March, 2000]

    There is a paper about Webmate here (other – older – papers there). The developer, Liren Chen, wrote other interesting personal agents. He/She (?) works in the Intelligent Software Agents Lab of the Robotics Institute, School of Computer Science, Carnegie Mellon University, headed by Katia Sycara (a lot of publications).
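The parallel search feature quoted above (query several engines, then reorder the results by how much they overlap) can be sketched as follows. This is only my reading of the description, not WebMate's algorithm: the class OverlapMergeSketch is hypothetical, and breaking ties by the best rank a URL reached in any engine is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: merge result lists from several engines by overlap.
final class OverlapMergeSketch {

    // Returns all URLs, those returned by the most engines first;
    // ties are broken by the best (lowest) rank seen in any engine.
    static List<String> merge(List<List<String>> resultLists) {
        Map<String, Integer> votes = new LinkedHashMap<>();
        Map<String, Integer> bestRank = new LinkedHashMap<>();
        for (List<String> results : resultLists) {
            Set<String> seenInThisEngine = new HashSet<>();
            for (int rank = 0; rank < results.size(); rank++) {
                String url = results.get(rank);
                if (!seenInThisEngine.add(url)) {
                    continue; // count each engine at most once per URL
                }
                votes.merge(url, 1, Integer::sum);
                bestRank.merge(url, rank, Math::min);
            }
        }
        List<String> merged = new ArrayList<>(votes.keySet());
        merged.sort(Comparator
                .comparingInt((String url) -> -votes.get(url))
                .thenComparingInt(url -> bestRank.get(url)));
        return merged;
    }
}
```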

Knowledge augmentation and retrieval

Knowledge augmentation

  • Scone – “A Java Framework to Build Web Navigation Tools”:

    Scone is a Java Framework published under the GNU GPL, which was designed to allow the quick development and evaluation of new Web enhancements for research and educational purposes. Scone is focussed on tools which help to improve the navigation and orientation on the Web.
    Scone has a modular architecture and offers several components, which can be used, enhanced and programmed using a plugin concept. Scone plugins can augment Web browsers or servers in many ways. They can:
    * generate completely new views of Web documents,
    * show extra navigation tools inside an extra window next to the browser,
    * offer workgroup tools to support collaborative navigation,
    * enrich web pages with new navigational elements,
    * help to evaluate such prototypes in controlled experiments etc.

    [latest version: “Version 1.1.34 from 13. Nov 2004” – Scone uses “IBM's WBI (Web Based Intermediary) as Proxy”]

    On the Related Projects page, many interesting tools are mentioned; among them: HTMLStreamTokenizer (“HtmlStreamTokenizer is an HTML parser written in Java. The parser classifies the HTML stream into three broad token types: tags, comments, and text.”), HTTPClient (“This package provides a complete http client library. It currently implements most of the relevant parts of the HTTP/1.0 and HTTP/1.1 protocols, including the request methods HEAD, GET, POST and PUT, and automatic handling of authorization, redirection requests, and cookies. Furthermore the included Codecs class contains coders and decoders for the base64, quoted-printable, URL-encoding, chunked and the multipart/form-data encodings.” – there is other interesting stuff on the webpage) and WebSPHINX: A Personal, Customizable Web Crawler:

    WebSPHINX (Website-Specific Processors for HTML INformation eXtraction) is a Java class library and interactive development environment for web crawlers. A web crawler (also called a robot or spider) is a program that browses and processes Web pages automatically.

    On the WebSPHINX webpage, one can find a list of other web crawlers and some references.

    Scone has got an unbelievable architecture, “developed as a research project at the Distributed Systems and Information Systems Group (VSIS) [from the Department of Informatics] of the University of Hamburg”. In the Documentation section, there are many papers and theses (the list of people in the project is on the main page). Many prototypes were also developed; BrowsingIcons being one of the most impressive: “BrowsingIcons is a tool to support revisitation of Web pages. To do this, it dynamically draws dynamic graphs of the paths of users as they surf the Web. Compared to using a plain browser, people can revisit web pages faster when they use these visualizations. A study showed that they also enjoy the visualizations more than Netscape alone.”

  • AgentFrank:

    The goal of Agent Frank is to be a personal intelligent intermediary and companion to internet infovores during their daily hunter/gatherer excursions. Whew. Okay, so what does that mean? Well, let's take it one buzzword at a time:
    Personal - While employing many traditionally server-side technologies, Agent Frank is intended to reside near the user, on the desktop or the laptop.
    Intelligent - Agent Frank wants to learn about the user, observe preferences and habits, and become capable of automating many of the tedious tasks infovores face. Eventually, this will come to involve various forms of machine learning and analysis, & etc.
    Intermediary - Amongst Agent Frank's facilities are network proxies that can be placed between local clients and remote servers. Using these, Agent Frank can tap into the user's online activities in order to monitor, archive, analyze, and alter information as it flows. For example, using a web proxy, Agent Frank can log sites visited, analyze content, filter out ads or harmful scripting.
    Companion - Agent Frank's ultimate purpose is to accompany an infovore and assist along the way.
    Agent Frank is, at least initially, a laboratory for hacker/infovores to implement and play with technologies and techniques to fulfill the above goals. At its core, Agent Frank is a patchwork of technologies stitched together into a single environment intended to enable this experimentation. At the edges, Agent Frank is open to plugins and scripting to facilitate quick development and playing with ideas.
    Agent Frank wants to be slick & clean one day, but not today. Instead, it is a large and lumbering creature with all the bolts, sockets, and stitches still showing. This is a feature, not a bug.

    [old: the last release was on “20030215” – GPL]

    Very impressive job done by Leslie Michael Orchard! This platform uses many open source tools:

    Implemented in Java, with an intent to stick to 100% pure Java.
    Makes use of Jetty for an embedded browser-based user interface
    Employs BeanShell to provide a shell prompt interface and scripting facilities
    RDF metadata is employed via the Jena toolkit
    Web proxy services are provided via the Muffin web proxy
    Text indexing and searching enabled by Jakarta Lucene.
    Exploring use of HSQL and/or Jisp for data storage.
  • The Arakne Environment:

    an open collaborative hypermedia system for the Web
    [old]

    There are a few interesting papers on the project page, written by the creator Niels Olof Bouvin from the Department of Computer Science – DAIMI of the Faculty of Science, University of Aarhus.
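WebSPHINX, quoted in the Scone entry above, describes a web crawler as a program that browses and processes Web pages automatically. The core loop is a breadth-first walk over links. The sketch below is not the WebSPHINX API: CrawlerSketch and its fetchLinks parameter are hypothetical, and the link extraction is passed in as a function so that the example runs without any network access.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of a breadth-first crawl; fetchLinks stands in for
// "download the page and extract its links".
final class CrawlerSketch {

    // Visits pages breadth-first from start and returns them in visiting order,
    // stopping after maxPages pages.
    static List<String> crawl(String start,
                              Function<String, List<String>> fetchLinks,
                              int maxPages) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<String> frontier = new ArrayDeque<>();
        seen.add(start);
        frontier.add(start);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.remove();
            visited.add(url);
            for (String link : fetchLinks.apply(url)) {
                if (seen.add(link)) { // true only the first time a URL is seen
                    frontier.add(link);
                }
            }
        }
        return visited;
    }
}
```

A real crawler would add politeness delays, robots.txt handling and URL normalization on top of this loop.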

Web 1.0 annotation

  • mprox - a second layer of consciousness:

    mprox is not a 'product' - we dont give a shit about business!
    mprox is not 'art' - we dont waste time being at the right parties talking shit about our work!
    mprox is simply an experiment.
    it is an experiment about how the web could be used for not only passive viewing of information, but active communication on top of (and below) this information.
    it will also be an experiment how people will develop ways to deal with these possibilities, since there is no censorship, control or administration involved.
    by using mprox, a second layer of consciousness is created on every web page you visit, that can be used to communicate, post messages, manipulate the content of the page or transform the web page into an art object. possibilities are unlimited and uncontrollable due to an easily expandable "plugin"-system.

    [very old: “v0.3, 2000/03/22”]

Knowledge retrieval

  • YaCy – p2p-based distributed Web Search Engine:

    The YACY project is a new approach to build a p2p-based Web indexing network.
    * Crawl your own pages or start distributed crawling
    * Search your own or the global index
    * Built-in caching http proxy, but usage of the proxy is not a requisite
    * Indexing benefits from the proxy cache; private information is not stored or indexed
    * Filter unwanted content like ad- or spyware; share your web-blacklist with other peers
    * Extension to DNS: use your peer name as domain name!
    * Easy to install! No additional database required!
    * No central server!
    * GPL'ed, freeware

    [active: “The latest YaCy-release is 0.37” (on “20050502”) – GPL]

    Very clear architecture, explained on the technology webpage.
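The feature list above mentions a built-in caching HTTP proxy. As a rough idea of what the caching part of any such proxy needs, here is a small least-recently-used cache built on java.util.LinkedHashMap. It says nothing about how YaCy actually stores pages: ProxyCacheSketch is a hypothetical name, and keeping bodies in memory as byte arrays keyed by URL is an assumption made to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an in-memory LRU cache of response bodies keyed by URL.
final class ProxyCacheSketch extends LinkedHashMap<String, byte[]> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    ProxyCacheSketch(int maxEntries) {
        // accessOrder = true: iteration order goes from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the
    // least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        return size() > maxEntries;
    }
}
```

The proxy would call get(url) before contacting the origin server and put(url, body) after a successful fetch.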

This is the first entry on this subject. More to come.

Posted by Jean-Philippe on May 25, 2005

April 11, 2005

Structured graphics, diagramming, graphs and networks in Java

After my previous entries MindManager-like in Java and Knowledge Visualization Applets in Java, here come my notes about and around:

The Network Analysis And Visualization on SourceForge offers an entry to the world of graphs and networks.

To learn more and find algorithms, the InfoVis Cyberinfrastructure is a good start: “This page provides pointers to commonly used data analysis and visualization algorithms. An 'IVC Software Framework' was implemented to facilitate the easy integration of diverse software packages and their menu driven usage.”

Two labs have developed a lot of good software and libraries:

Another hot spot is the aesthetics + computation group of the MIT Media Lab, with people like Ben Fry (Master's Thesis: organic information design). If you are interested in digital visual design, have a look at the Processing programming language: “ Processing is a programming language and environment built for the electronic arts and visual design communities. It is created to teach fundamentals of computer programming within a visual context and to serve as a software sketchbook. It is used by students, artists, designers, architects, and researchers for learning, prototyping, and production.”.

Other interesting (but old) webpages:

Structured graphics

Structured 2.5D graphics

  • Piccolo (Jazz): “Piccolo is a toolkit that supports the development of 2D structured graphics programs, in general, and Zoomable User Interfaces (ZUIs), in particular. A ZUI is a new kind of interface that presents a huge canvas of information on a traditional computer display by letting the user smoothly zoom in, to get more detailed information, and zoom out for an overview. We use a "scene-graph" model that is common to 3D environments. Basically, this means that Piccolo maintains a hierarchal structure of objects and cameras, allowing the application developer to orient, group and manipulate objects in meaningful ways.” [really cool]
  • ZVTM - Zoomable Visual Transformation Machine: “The ZVTM is a Zoomable User Interface (ZUI) toolkit implemented in Java, designed to ease the task of creating complex visual editors in which large amounts of objects have to be displayed, or which contain complex geometrical shapes that need to be animated. It is based on the metaphor of universes that can be observed through smart movable/zoomable cameras, and offers features such as perceptual continuity in object animations and camera movements, which should make the end-user's overall experience more pleasing. The ZVTM features a graphical object model that makes the task of creating, modifying and animating graphical entities easier, allows the definition of custom shapes, all through a simple API. The ZVTM also features smooth zooming capabilities (2.5D/zoomable user interface), multiple independent layers inside a single viewport, multi-threaded views, and support for exporting SVG documents.”
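Both toolkits above rely on the same idea: a hierarchy of objects observed through a camera, where zooming is just one more transform. A minimal sketch of that "scene-graph" model, using only java.awt.geom, could look like this. It is not the Piccolo or ZVTM API: SceneNodeSketch and its methods are hypothetical, and the camera is reduced to a single AffineTransform.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a scene-graph node with a local transform.
final class SceneNodeSketch {
    private final AffineTransform local;
    private SceneNodeSketch parent;
    private final List<SceneNodeSketch> children = new ArrayList<>();

    SceneNodeSketch(AffineTransform local) {
        this.local = new AffineTransform(local);
    }

    // Attaches child under this node and returns the child for chaining.
    SceneNodeSketch add(SceneNodeSketch child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    // Composes the transforms from the root down to this node.
    AffineTransform localToGlobal() {
        AffineTransform t = (parent == null)
                ? new AffineTransform()
                : parent.localToGlobal();
        t.concatenate(local);
        return t;
    }

    // Maps a point in this node's coordinates to screen coordinates for a camera.
    Point2D toScreen(AffineTransform camera, double x, double y) {
        AffineTransform t = new AffineTransform(camera);
        t.concatenate(localToGlobal());
        return t.transform(new Point2D.Double(x, y), null);
    }
}
```

Moving a group means changing one parent transform, and zooming means changing only the camera transform; no child coordinates are touched in either case.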

Data oriented structured 2D graphics

  • The InfoVis Toolkit: “The InfoVis Toolkit is an Interactive Graphics Toolkit written in Java to ease the development of Information Visualization applications and components. The main characteristics of the InfoVis Toolkit are: unified data structure (the base data structure is a table of columns. Columns contain objects of homogeneous types, such as integers or strings. Trees and Graphs are derived from Tables); small memory footprint (using homogeneous columns instead of compound types improves dramatically the memory required to store large tables, trees or graphs, and generally the time to manage them); unified set of interactive components (interactive filtering -a.k.a. dynamic queries- can be performed with the same control objects and components regardless of the data structure, simplifying the reuse of existing components and the design of generic ones); fast (the InfoVis Toolkit can use accelerated graphics provided by Agile2D, an implementation of Java2D based on the OpenGL API for hardware accelerated graphics. On machine with hardware acceleration, some visualizations redisplay 100 times faster than with the standard Java2D implementation); extensible (the InfoVis Toolkit is meant to incorporate new information visualization techniques and is distributed with the full sources and with a very liberal licence - it could be a base for student projects, research projects or commercial products).”
  • prefuse: “prefuse is a user interface toolkit for building highly interactive visualizations of structured and unstructured data. This includes any form of data that can be represented as a set of entities (or nodes) possibly connected by any number of relations (or edges). Examples of data supported by prefuse include hierarchies (organization charts, taxonomies, file systems), networks (computer networks, social networks, web site linkage) and even non-connected collections of data (timelines, scatterplots). Using this toolkit, developers can create responsive, animated graphical interfaces for visualizing, exploring, and manipulating these various forms of data.”
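The InfoVis Toolkit blurb above describes a table of homogeneous columns and dynamic queries over it. The general column-oriented idea can be sketched as below. This is not the toolkit's class design: ColumnTableSketch, its two fixed columns and the rowsWhereSize method are hypothetical, chosen only to show why a primitive int[] column is cheaper than one object per row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch of a column-oriented table: one array per column
// instead of one object per row.
final class ColumnTableSketch {
    final String[] names; // homogeneous column of strings
    final int[] sizes;    // homogeneous column of primitive ints, no boxing

    ColumnTableSketch(String[] names, int[] sizes) {
        if (names.length != sizes.length) {
            throw new IllegalArgumentException("columns must have the same length");
        }
        this.names = names;
        this.sizes = sizes;
    }

    int rowCount() {
        return sizes.length;
    }

    // A "dynamic query": returns the indices of the rows whose size passes the filter.
    List<Integer> rowsWhereSize(IntPredicate filter) {
        List<Integer> rows = new ArrayList<>();
        for (int row = 0; row < sizes.length; row++) {
            if (filter.test(sizes[row])) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

Because a filter returns row indices rather than copies, the same selection can drive several views of the same table.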

Shape oriented structured 2D graphics

  • Eclipse Graphical Editing Framework (GEF): “The Graphical Editing Framework (GEF) allows developers to take an existing application model and quickly create a rich graphical editor.” [for and around Eclipse]
  • JHotDraw as Open-Source Project: “JHotDraw is a Java GUI framework for technical and structured Graphics. It has been developed as a "design exercise" but is already quite powerful. Its design relies heavily on some well-known design patterns. JHotDraw's original authors have been Erich Gamma and Thomas Eggenschwiler.” [“Last update: 07.10.2004”]

    The original HotDraw can be found on the HotDraw Home Page, and what seems to be an older implementation of the concept by RoleModel Software on the Drawlets webpage.

  • Mica Graphics Framework Classic: “Mica is a 2D, high-level, full-featured, object-oriented, hierarchical, structured, resolution-independent, mixed graphics and user interface widget library with multiple levels of drawing abstraction. [...] Mica has complete support for advanced features such as infinite undo/redo, zoom and pan, network graph layouts and interactive graph templates, connections and connection points, annotations and annotation points, event handling and action percolation, layers and layer tabs, arrows, shadows, cut/copy/paste to/from clipboard, multiple printable pages, postscript, jpeg, and pdf output, rulers, toolhints, status bars, tool bars, default save/load to/from ASCII files, complete working network and diagramming editors, and more... ” [“April 7, 2005” – too monolithic cos complete UI framework]
  • Vector Visuals: “Vector Visuals provides an easy-to-use, object-based API for creating and manipulating Java2D-rendered shapes and images. It features object embedding, dynamic connectors, and multithreaded task support.” [no info]
  • Diva: “Diva is an architecture for visualizing and interacting with dynamic information spaces. In Diva, visualizations are built by hooking together software components which generate, supply, filter, and display information. The current release focuses on Diva's canvas and graph visualization infrastructure.” [seems like advanced sketching]
  • Ezd (“easy-to-use structured graphics for Java”): “Digital Equipment Corporation's Western Research Lab is proud to offer an alternative: an EzdView class that extends the Java AWT Canvas class to provide easy-to-use structured graphics. A program "draws" by creating glyphs and adding them to a drawing. The program displays the drawing by associating it with a view which maps the drawing onto a portion of the display screen. A drawing may be displayed in multiple views, and a view may display multiple drawings. Once the view(s) are defined, the program no longer concerns itself with the display. Ezd takes responsibility for all display management. When the screen area under a view needs to be updated, the view draws each object in the order that it appears in the drawing, translating the drawing's coordinates to screen coordinates. It draws using the "painter's algorithm", i.e. colors are opaque and later drawn objects may obscure earlier drawn objects. As the user program executes, it changes what's on the display by making changes to glyphs, drawings, or views. Glyphs may change how they wish to be drawn, or be added, deleted, or rearranged in a drawing. The mappings of drawings onto views may also be changed. With no user intervention, Ezd automatically reflects these changes on the display. When mouse or keyboard events occur, Ezd identifies the effected glyph and calls the glyph's event handling method.” [“Last updated: 8 December 1997”]
  • subArctic: “SubArctic is not yet another AWT widget set. It is a complete, full-functioned, industrial strength toolkit designed to be used for all your user interface needs. SubArctic is based on 10 years of toolkit research and is designed to offer the advanced interface techniques needed to go beyond static interfaces and simple collections of widgets. SubArctic is highly extensible and supports a number of sophisticated effects not available in other toolkits (and provides the basic infrastructure to build much more). Specific features include: support for sophisticated drawing effects which can be applied to all interface elements; animation support based on a high level path model with controlled timing and support for effects such as slow-in/slow-out, anticipation and follow through, etc.; facilities for semantic snapping interactions; provisions for standard and custom interactor (widget) styles that can be switched dynamically; support for semantic lens interactions; a full-functioned interactor library with the customary buttons, check-boxes, sliders, etc., as well as more sophisticated interface composition techniques supporting dragging, snapping, lenses, and animation, all of which can be mixed with and added to other techniques; new techniques to support interactive debugging of interfaces; a built-in, efficient, and easy to use constraint evaluator for support of flexible dynamic layout, a well developed, and carefully designed infrastructure for extensibility at all levels, and; the system is available free.” [“Last revision: January 17 1997”]
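The Ezd description above explains the "painter's algorithm": glyphs are drawn in the order they appear in the drawing, colors are opaque, and later glyphs may hide earlier ones. Here is a tiny sketch of that idea on a character grid. It is not the Ezd API: PainterSketch, its Glyph class and the hit method are hypothetical, and glyphs are limited to filled rectangles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the painter's algorithm on a grid of characters.
final class PainterSketch {

    // A filled, axis-aligned rectangle drawn with a single character.
    static final class Glyph {
        final int x, y, width, height;
        final char fill;

        Glyph(int x, int y, int width, int height, char fill) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
            this.fill = fill;
        }

        boolean contains(int px, int py) {
            return px >= x && px < x + width && py >= y && py < y + height;
        }
    }

    private final List<Glyph> drawing = new ArrayList<>();

    void add(Glyph glyph) {
        drawing.add(glyph);
    }

    // Draws glyphs in insertion order, so later glyphs overwrite earlier ones.
    String[] render(int cols, int rows) {
        char[][] screen = new char[rows][cols];
        for (char[] row : screen) {
            Arrays.fill(row, '.');
        }
        for (Glyph g : drawing) {
            for (int py = Math.max(0, g.y); py < Math.min(rows, g.y + g.height); py++) {
                for (int px = Math.max(0, g.x); px < Math.min(cols, g.x + g.width); px++) {
                    screen[py][px] = g.fill;
                }
            }
        }
        String[] lines = new String[rows];
        for (int i = 0; i < rows; i++) {
            lines[i] = new String(screen[i]);
        }
        return lines;
    }

    // Finds the glyph under a point by scanning from the topmost (last drawn) glyph.
    Glyph hit(int px, int py) {
        for (int i = drawing.size() - 1; i >= 0; i--) {
            if (drawing.get(i).contains(px, py)) {
                return drawing.get(i);
            }
        }
        return null;
    }
}
```

Event dispatch, as described in the quote, then amounts to calling hit with the mouse position and handing the event to the glyph it returns.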

Graphs or networks

Format for graphs

The Graphviz galaxy

  • ZGRViewer: “ZGRViewer is a 2.5D graph visualizer implemented in Java and based upon the Zoomable Visual Transformation Machine. It is specifically aimed at displaying graphs expressed using the DOT language from AT&T GraphViz and processed by programs dot or neato. ZGRViewer is designed to handle large graphs, and offers a zoomable user interface (ZUI), which enables smooth zooming and easy navigation in the visualized structure. ”
  • Grappa: “Grappa is a Java graph drawing package that simplifies the inclusion of graph display and manipulation capabilities within Java applications and applets. It has a good number of useful features built into it, but is also extensible. Grappa can be thought of as a port of a subset of GraphViz to Java.”
  • graphopt: “This program optimizes graph layouts. That's pretty much it.” [quite old: “v0.4.1 - 2003-05-06 17:00” – C++, for Linux?]
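The tools in this corner exchange graphs as text in the DOT language, so producing input for them from Java is mostly string building. A minimal sketch, assuming a directed graph given as a list of (from, to) pairs; DotWriterSketch is a hypothetical helper, not part of Grappa or ZGRViewer.

```java
import java.util.List;

// Hypothetical helper that writes a directed graph in the DOT language.
final class DotWriterSketch {

    // Each edge is a two-element array: {from, to}.
    static String toDot(String graphName, List<String[]> edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(quote(graphName)).append(" {\n");
        for (String[] edge : edges) {
            sb.append("  ").append(quote(edge[0]))
              .append(" -> ").append(quote(edge[1])).append(";\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    // Wraps an identifier in double quotes, escaping embedded double quotes.
    private static String quote(String id) {
        return "\"" + id.replace("\"", "\\\"") + "\"";
    }
}
```

The resulting text can be saved to a file and handed to the dot or neato programs mentioned above for layout.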

The RDF's corner

  • IsaViz - A Visual Authoring Tool for RDF: “IsaViz is a visual environment for browsing and authoring RDF models represented as graphs. It features: a 2.5D user interface allowing smooth zooming and navigation in the graph; creation and editing of graphs by drawing ellipses, boxes and arcs; RDF/XML, Notation 3 and N-Triple import; RDF/XML, Notation 3 and N-Triple export, but also SVG and PNG export.”

    As explained in Graph Stylesheets (GSS) in IsaViz, IsaViz makes use of the GSS styling language.

The JGraph galaxy

The best graphing suite to handle graphs in the way of Microsoft Visio (diagramming).

  • JGraph Swing Component: “With JGraph, users from highly-technical to very non-technical are able to display and edit complex information without the need to understand the underlying complexity. JGraph can be integrated into custom applications and websites and allows to use and interact with any data model, from XML files to databases or other native systems. Written in 100% pure Java, JGraph provides key features such as zooming, cell collapsing/expanding, undo, full event-handling, drag and drop support and much more.” [LGPL]
  • JGraph Layout Pro: “JGraph Layout Pro is the next generation of Java Graph Layout engine designed for optimal performance with the JGraph core. JGraph Layout Pro has a flexibility and simple design, enabling you to use circular, tree and force-directed layouts with ease in your JGraph application. Layout Pro comes with a developers guide and an example applet that demonstrates features such as auto-layout, collapsing/expanding of grouped cells, graph morphing and selective layouting of sub-graphs.” [not available for free]
  • JGraphpad Application Toolkit: “JGraphpad is a powerful, free diagram editor based on JGraph. It is currently available in English, French and German. With JGraphpad, you can create flow charts, maps, UML diagrams, and many other diagrams. JGraphpad is provided as an example for the JGraph Swing component.” [GPL]
  • JGraphT: “JGraphT is a free Java graph library that provides mathematical graph-theory objects and algorithms. JGraphT supports various types of graphs including: directed and undirected graphs; graphs with weighted / unweighted / labeled or any user-defined edges; various edge multiplicity options, including: simple-graphs, multigraphs, pseudographs; unmodifiable graphs - allow modules to provide "read-only" access to internal graphs; listenable graphs - allow external listeners to track modification events; subgraphs - graphs that are auto-updating subgraph views on other graphs; all compositions of above graphs.”

The TouchGraph galaxy

In the spirit of inxight and Thinkmap.

  • Example from the development page:
    • KAON OI-modeler: “OI-modeler is a tool for ontology creation and maintenance. The goal of the tool is to allow scalability for editing large ontologies, as well as to incorporate some usability issues related to ontology management. The graph layout algorithms in OI-modeler are based on an open-source TouchGraph library.” [“Last modified 10-12-2002 09:13 AM” – “Package kaon - Version KAON 1.2.9 - Date April 4, 2005”];
    • GraphLayout;
    • WikiBrowser;
    • LinkBrowser;
    • More applications in the news section.
  • HyperGraph: “HyperGraph is an open source project which provides java code to work with hyperbolic geometry and especially with hyperbolic trees. It provides a very extensible api to visualize hyperbolic geometry, to handle graphs and to layout hyperbolic trees. As soon as you want to look at large data volume that has a hierarchical structure, you will find hyperbolic trees very useful - they show more data than standard tree representations like your favorite explorer, and they have a great look and feel.” [quite useful with TouchGraph]
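Why do hyperbolic trees "show more data"? In the Poincaré disk model (curvature -1), a point at hyperbolic distance d from the center is drawn at Euclidean radius tanh(d/2), so every level of a tree fits inside the unit disk and deeper levels get squeezed toward the rim. The sketch below only illustrates that mapping; PoincareSketch is a hypothetical name, and whether HyperGraph uses exactly this projection is an assumption.

```java
// Hypothetical sketch of placing tree nodes in the Poincaré disk.
final class PoincareSketch {

    // Euclidean radius in the unit disk for a point at hyperbolic distance d
    // from the center.
    static double diskRadius(double hyperbolicDistance) {
        return Math.tanh(hyperbolicDistance / 2.0);
    }

    // Places a node at the given tree depth and angle (radians), with one
    // hyperbolic step of length `step` per level. Returns {x, y}.
    static double[] place(int depth, double theta, double step) {
        double r = diskRadius(depth * step);
        return new double[] { r * Math.cos(theta), r * Math.sin(theta) };
    }
}
```

Dragging the tree in such a viewer corresponds to moving another node to the center, which enlarges its neighborhood and shrinks the rest.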

The two main commercial diagramming systems

Other graph visualization and layout systems

  • GEF - Java Graph Editing Framework: “The goal of the GEF project is to build a graph editing library that can be used to construct many, high-quality graph editing applications. Some of GEF's features are: a simple, concrete design that makes the framework easy to understand and extend; Node-Port-Edge graph model that is powerful enough for the vast majority of connected graph applications; Model-View-Controller design based on the Swing Java UI library makes GEF able to act as a UI to existing data structures, and also minimizing learning time for developers familiar with Swing; high-quality user interactions for moving, resizing, reshaping, etc. GEF also supports several novel interactions such as the broom alignment tool and selection-action-buttons; generic properties sheet based on JavaBeans introspection; XML-based file formats based on the PGML standard (soon to support SVG).” [used by ArgoUML]
  • JGraphEd: “a Java Graph Editing application and Graph Drawing framework. JGraphEd was designed to allow users to easily create graphs step by step by adding or removing, or modifying nodes or edges. JGraphEd is modular by design and a variety of standalone and interdependent algorithms have been provided for manipulating or visualizing graphs. JGraphEd was also designed with extensibility in mind, in order to allow developers to quickly and painlessly add their own algorithms to the included library.” [“Last Modified: May 4, 2004”]
  • GVF - The Graph Visualization Framework: “The Graph Visualization Framework is a set of design patterns and approaches that can serve as an example for applications that either manipulate graph structures or visualize them. The libraries implement several basic modules for input, graph management, property management, layout, and rendering. Some modules could be made to operate independently with some modification. For example, the graph management module can, in principle, be used as the data structure part of a program which doesn't necessarily use visualization.” [“1.36 - 2004-03-03 16:00”]
  • JDigraph: “JDigraph is a Java library for representing and working with directed graphs and paths. The API is patterned after the Java Collections API.” [no info]
  • The VGJ Graph Drawing Tool [no info]
  • NetEditor (The Arakhnê Network Editor): “The Arakhnê Network Editor (NetEditor) is a free Java 2 component that permits to edit and show connected-graphs (or networks). NetEditor is only composed by a drawing area in which you can draw nodes and edges. NetEditor supports the following features: graphical editing of the graph structure; depth levels for nodes and edges; can undoing and redoing user actions; alignment of nodes and edges; clipboard management; exporting into graphical formats: GIF, PPM, PNG...; exporting into vectorial formats: Postscript, Xfig...; XML save and load.” [small and quite old: “0.6 (2003/05/23)”]
  • OpenJGraph - Java Graph and Graph Drawing Project: “The goal of this project is to create an opensource Java library, licensed under LGPL, to create and manipulate graphs. Current features include: directed, undirected, directed acyclic graphs, and weighted graphs; simple graph algorithms, such as graph traversal, minimum spanning tree and shortest path spanning trees for weighted graphs; basic graph drawing, including straight line and orthogonal graph drawing . however, more work still needs to be done here; user interaction, such as creating and removing a vertex, creating and removing an edge, dragging a vertex, and changing some of the vertex and edge properties.” [small and old: “$Date: 2002/10/05 09:57:28 $”]
  • Grace: “Grace is a generator for direct manipulation graph editors in Java. It can be used for any graph-like datastructure of your application.” [small and old: “ most recently modified at 10/18/1999 13:24 UTC”]
  • VGJ (Visualizing Graphs with Java): “VGJ, Visualizing Graphs with Java, is a tool for graph drawing and graph layout. Graphs can be input into VGJ in two ways: with a textual description (GML), or through a drawing the user creates using our graph editor. The user can then select an algorithm to layout the graph in an organized and (hopefully) aesthetically pleasing way.” [dead: “The current version is 1.03, released on 4/20/98. Development of VGJ at this (the original) site has stopped.”]
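Several of the libraries above list shortest paths on weighted graphs among their algorithms (OpenJGraph, for instance). For reference, here is a textbook Dijkstra in plain Java. It does not use any of these libraries' APIs: ShortestPathSketch is a hypothetical name, and representing the graph as an adjacency matrix where 0 means "no edge" is an assumption made for brevity.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical sketch: Dijkstra's algorithm on an adjacency matrix.
final class ShortestPathSketch {

    // weights[u][v] > 0 is the weight of the edge u -> v; 0 means "no edge".
    // Returns the distance from source to every vertex (infinity if unreachable).
    static double[] dijkstra(double[][] weights, int source) {
        int n = weights.length;
        double[] dist = new double[n];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;

        // Queue entries are {distance, vertex}, ordered by distance.
        PriorityQueue<double[]> queue =
                new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
        queue.add(new double[] {0.0, source});

        while (!queue.isEmpty()) {
            double[] head = queue.remove();
            int u = (int) head[1];
            if (head[0] > dist[u]) {
                continue; // stale entry: a shorter path to u was already found
            }
            for (int v = 0; v < n; v++) {
                double w = weights[u][v];
                if (w > 0 && dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    queue.add(new double[] {dist[v], v});
                }
            }
        }
        return dist;
    }
}
```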

Large graphs and networks

  • JUNG - Java Universal Network/Graph Framework: “ JUNG — the Java Universal Network/Graph Framework--is a software library that provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network. It is written in Java, which allows JUNG-based applications to make use of the extensive built-in capabilities of the Java API, as well as those of other existing third-party Java libraries. The JUNG architecture is designed to support a variety of representations of entities and their relations, such as directed and undirected graphs, multi-modal graphs, graphs with parallel edges, and hypergraphs. It provides a mechanism for annotating graphs, entities, and relations with metadata. This facilitates the creation of analytic tools for complex data sets that can examine the relations between entities as well as the metadata attached to each entity and relation. The current distribution of JUNG includes implementations of a number of algorithms from graph theory, data mining, and social network analysis, such as routines for clustering, decomposition, optimization, random graph generation, statistical analysis, and calculation of network distances, flows, and importance measures (centrality, PageRank, HITS, etc.). JUNG also provides a visualization framework that makes it easy to construct tools for the interactive exploration of network data. Users can use one of the layout algorithms provided, or use the framework to create their own custom layouts. In addition, filtering mechanisms are provided which allow users to focus their attention, or their algorithms, on specific portions of the graph.” [“26 February 2005: JUNG 1.5.4 released”]
  • GUESS: The Graph Exploration System: “A database driven system that allows nodes and edges to include attributes beyond basic display features (we support continuous, categorical, and binary attributes). GUESS lets you represent those features in a database and through a powerful interpreted, embedded (python-based) language which allows you to easily manipulate the graph based on those features and create new programs. [...] A zoomable interface to large graphs allowing for the visualization of graphs and networks on an infinite plane with infinite (smooth) zoom. Try the applet to get a sense of this. Save and animate graph states, see the movie pages for an example. Complete cinematographic control over nodes, edges, and the camera for more powerful dynamic graph visualization. Export and capture EPS/PDF/JPG/SVG/... Various layout algorithms and graph analysis commands. Talks to R to access pre-implemented statistical analysis routines. ” [Hewlett-Packard Labs]
  • Walrus - Graph Visualization Tool: “Walrus is a tool for interactively visualizing large directed graphs in three-dimensional space. By employing a fisheye-like distortion, it provides a display that simultaneously shows local detail and the global context.” [and many CAIDA Tools – GPL]
  • Graph INterface librarY a.k.a. GINY: “GINY implements a very innovative system for sub-graphing and allows for stunning visuals. GINY is open source, provides a number of layout algorithms, and is designed to be a very intuitive API.” [“The back-end of the package is currently using libraries from the CERN Colt project.” – “The visual side of GINY is implemented making extensive use of Piccolo.” – “A number of our layouts are derived from implementations from the JUNG project.” – “incorporated into Cytoscape”]
  • Mascopt: “The main objective of the Mascopt (Mascotte Optimization) project is to provide a set of tools for network optimization problems. Examples of problems are routing, grooming, survivability, or virtual network design. Mascopt will help implementing a solution to such problems by providing a data model of the network and the demands, libraries to handle networks and graphs, and ready to use implementation of existing algorithms or linear programs (e.g integral multicommodity flow).” [“Last Published: Thu, 17 Feb 2005 17:16:12 GMT”]
  • AGD - Algorithms for Graph Drawing: “AGD offers a broad range of existing algorithms for two-dimensional graph drawing and tools for implementing new algorithms.” [quite old: “Last modification: Donnerstag, 04-Dez-2003 17:31:49 CET”]
  • Rox [in Brazilian Portuguese]
  • Pajek: “Program for Large Network Analysis” [Win32 but runs on Linux with Wine – “ Test version of Pajek 1.04 for Windows 32 (March 31, 2005 [...])”]
  • WilmaScope: “WilmaScope is a Java3D application which creates real time 3d animations of dynamic graph structures.” [quite old: “17th October 2003 - New release of WilmaScope (V2.2)”]
  • vrmlgraph (“3-D VRML graph drawing package in Java”): “An object oriented Java program available with source code for creating 3-D graph representations, complete with example main methods and extensions for performing the following: storage of Nodes and Edges of a 3-D graph in a single GraphData object; parsing descriptions of connected nodes from a graph program text file and populating a GraphData object with them. The nodes do not need to specify their x,y,z locations; performing 3-D spring embedding calculations to produce (often) aesthetically pleasing graphs from any input; center any 3-D graph about the origin; output a text file that describes the currently stored 3-D graph; output a VRML file that shows a 3-D view of the current graph.” [old: “April 5, 2001”]
  • Tulip: “a library for huge graph” [Linux? with C++ – “06-Jan-2004> Tulip version 2.0.1 released”]
  • P.I.G.A.L.E. - Public Implementation of a Graph Algorithm Library and Editor: “ We develop a graph editor and a C++ algorithm library essentially concerned with planar graphs. The editor is particularly intended for graph theoretical research.” [Linux? With C++ – “Last modified: Sat Dec 18 18:03:37 CET 2004” – interesting for algorithms]
  • Graphlet: “Graphscript, a Tcl/Tk based, extensible programming language for graph algorithms with user interfaces. Graphscript is implemented in C++ and GTL, an STL based library for programming with graphs.” [Win32 with C++ and Tcl/Tk – old: “Last modified Sunday, August 24, 1999 11:25 AM”]
  • Social Networks Visualiser for Linux: “Social Network Visualiser (SocNetV) is a GNU program which intends to help anyone, using Linux OS, to visualise graphically and play with social networks.” [Linux with C++? – “v0.37, Feb, 23 2005”]

Charting

  • JFreeChart: “ JFreeChart is a free Java class library for generating charts, including: pie charts (2D and 3D); bar charts (regular and stacked, with an optional 3D effect); line and area charts; scatter plots and bubble charts; time series, high/low/open/close charts and candle stick charts; combination charts; Pareto charts; Gantt charts; wind plots, meter charts and symbol charts; wafer map charts. Other features offered by JFreeChart: complete source code is included, under the terms of the GNU Lesser General Public Licence; access to data from any source via dataset interfaces; support for multiple secondary axes and datasets; tooltips, zooming, printing; direct export to PNG and JPEG; export to PDF via iText and SVG via Batik (both described in the JFreeChart Developer Guide); support for servlets, JSP (thanks to Cewolf), applets or client applications; comprehensive Javadocs.” [LGPL]

Geospatial data

  • OpenMap - Open Systems Mapping Technology: “OpenMap™ is a JavaBeans™ based toolkit for building applications and applets needing geographic information. Using OpenMap components, you can access data from legacy applications, in-place, in a distributed setting. At its core, OpenMap is a set of Swing components that understand geographic coordinates. These components help you show map data, and help you handle user input events to manipulate that data.”
  • GeoVISTA Studio: “GeoVISTA Studio is an open software development environment designed for geospatial data. Studio is a programming-free environment that allows users to quickly build applications for geocomputation and geographic visualization.”

Handling of images (file formats)

  • Batik: “ Batik is a Java technology based toolkit for applications or applets that want to use images in the Scalable Vector Graphics (SVG) format for various purposes, such as viewing, generation or manipulation.” [Apache]
  • VectorGraphics: “The VectorGraphics package of FreeHEP Java Library enables any Java program to export to a variety of vector graphics formats as well as bitmap image formats. Among the vector formats are PostScript, PDF, EMF, SVG, SWF and CGM, while the image formats include GIF, PNG, JPG and PPM.” [a library]

GUI toolkit

  • Eclipse SWT - Standard Widget Toolkit: “SWT is a widget toolkit for Java designed to provide efficient, portable access to the user-interface facilities of the operating systems on which it is implemented.” [around Eclipse]
  • XUI: “XUI is a Java and XML framework for building rich client, desktop and mobile applications. The framework can save you up to 60% of the code typically needed to build an application. The result is real savings in development time and maintenance costs and greater stability.” [“XUI 1.0.4 has been promoted to the latest stable version as of 6th January 2005.”]
  • Buoy - A Better User Interface Toolkit: “Buoy is a library for creating user interfaces in Java programs. It is built on top of Swing, but provides a completely new set of classes to represent graphical components. It offers many advantages over using Swing directly, including: a much simpler, cleaner, and more consistent API; a better mechanism for laying out interface components; a far more powerful event handling mechanism, which is based on dynamic binding of arbitrary methods as event listeners; built in support for serializing user interfaces as XML, then reconstructing them again. ” [“The current version of Buoy is 1.4, released April 9, 2005.”]
  • Swank - Open-Source, Scriptable GUI Tool Kit: “Swank is a graphical user interface toolkit implemented entirely in JAVA™. Swank provides the companion to Jacl, the Tcl interpreter implemented in JAVA™. Thus, Jacl/Swank forms the analogous pair to Tcl/Tk and can be used to rapidly script user interfaces. The goal of the design of Swank is to provide a toolkit that will be familiar to Tk users, provide a reasonable level of backwards compatibility with Tk, and provide access to the generally greater, as compared to Tk, feature level of the Swing components.” [a kind of Tcl/Tk couple in Java]
  • japi - java application programming interface: “japi is an open source free software GUI toolkit, which makes it easy to develop platform independent applications. Written in JAVA and C, japi provides the JAVA AWT Toolkit to non object oriented Languages like C, Fortran, Pascal and even Basic. ” [“26 Feb 2003 (V1.0.6)”]

Posted by Jean-Philippe on April 11, 2005 34 Comments, 804 TrackBacks

December 30, 2004

Knowledge Visualization Applets in Java

Following my entry about mindmapping tools,

    Knowledge Visualization Applets:

  • Thinkmap (“Thinkmap is designed to improve the ability of users to browse through complex information. It allows for interactive, dynamic filtering and the display of structured information without overwhelming the user or undermining the depth of the information resource.”);
  • TheBrain Technologies Corporation (“Collapsing the Time to Knowledge”);
  • Inxight Software (“Discover the True Value of Information”);
  • TouchGraph (open source);
  • at last, The Ceryle Project, which is more a knowledge management application than an applet (“Ceryle is a free tool to help you get organized. If you use a lot of post it notes, are a writer, journalist, researcher, student, or anyone compiling a lot of information for a project, maybe just trying to organize your recipes, bookmarks, or your MP3 collection, Ceryle is designed to assist you in keeping track of things. Ceryle includes features to help you store, find and even visualize your information, using what is called a graph visualization. Those same graphs also enable you to create structures for your documents, see relationships between ideas, and even generate composite documents based on those structures, like building a book from its chapters or a screenplay from its scenes.”).

(The first three via Swing Sightings Volume 4: Knowledge Visualization Applets, the last from Keith Devens.com on Wednesday, December 29, 2004).

Posted by Jean-Philippe on December 30, 2004 1 Comments, 1238 TrackBacks

Databinding in Java

There is an interesting poll on Manageability: What's the Best Java Tool for XML Binding?. In the results I discovered XStream, “a simple library to serialize objects to XML and back again”, which seems to be very lightweight, whereas I personally use the more complete XMLBeans from the Apache XML Project (via Erik's Linkblog: Wednesday, December 29, 2004 @687).

Posted by Jean-Philippe on December 30, 2004 1 Comments, 179 TrackBacks

Resolving dependencies, building and releasing in Java

Vincent Massol wrote an interesting article called Unbreakable builds, or: what to do without build awareness in a team? (comment on Euxx: Self healing builds and responsibilities, via Erik's Linkblog: Wednesday, December 29, 2004 @687).

Posted by Jean-Philippe on December 30, 2004 0 Comments, 179 TrackBacks

Reading and writing PowerPoint files with Java

Yes! You can now manipulate Microsoft PowerPoint files in Java, thanks to Tonic Systems, but it is not free:

If you want a free solution, you might have a look at xlhtml or wrap the OpenOffice.org PowerPoint importer and exporter with JNI (the free complete book from Sun: Java Native Interface: Programmer's Guide and Specification).

Posted by Jean-Philippe on December 30, 2004 0 Comments, 143 TrackBacks

Online documentation for Java libraries

  • JDocs (“a comprehensive online resource for Java API documentation”);
  • JSourcery (“the ultimate marriage of Javadoc documentation and source code”).

Posted by Jean-Philippe on December 30, 2004 3 Comments, 218 TrackBacks

MindManager-like in Java

    Mindjet MindManager-like in Java:
  • FreeMind – “free mind mapping software” (“a premier free mind-mapping software written in Java”);
  • AutoFocus – “Your Desktop Exploration Tool” (not open source).

Have a look at Alternatives to using FreeMind for other software.

Posted by Jean-Philippe on December 30, 2004 1 Comments, 196 TrackBacks

Free project management tools in Java

Posted by Jean-Philippe on December 30, 2004 0 Comments, 178 TrackBacks

Another library in Java for Regular Expressions

Maybe the last, but not the least: the project jrexx – “automaton based regular expression API for Java” and its beautiful jrexx-Lab – “laboratory for regular expression analysis”.

Posted by Jean-Philippe on December 30, 2004 0 Comments, 183 TrackBacks

December 23, 2004

Rediscovery of Regular Expressions

This collection of links is brought to you in order to rediscover with me the fabulous world of regular expressions. I am currently using a lot of regexp in the typographic engine I am building. Books by O'Reilly:

On Perldoc.com, a huge site with Perl documentation:

Websites:

Regular Expressions testing tools:

Java packages (excerpts from regex.info's Links to Java Regex Packages):

PHP can use Perl Compatible Regular Expressions (PCRE).

More: the e-x-c-e-l-l-e-n-t article Marshall McLuhan vs. Marshalling Regular Expressions by Andy Oram.

“Regular expressions extend the reach of text, and therefore inexorably change how we sense the text.”
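As a small illustration of the kind of regexp work a typographic engine might do, here is a minimal sketch using the standard java.util.regex package. The class name TypographySketch, the method tidy and the three rules it applies are hypothetical examples chosen for this note; they are not taken from the engine mentioned above.

```java
import java.util.regex.Pattern;

public class TypographySketch {
    // Three or more consecutive dots, to be replaced by a single ellipsis character.
    private static final Pattern DOTS = Pattern.compile("\\.{3,}");
    // Runs of two or more spaces or tabs, to be collapsed into one space.
    private static final Pattern SPACES = Pattern.compile("[ \\t]{2,}");
    // A straight-quoted span; group 1 captures the text between the quotes.
    private static final Pattern DOUBLE_QUOTED = Pattern.compile("\"([^\"]*)\"");

    /** Applies the three illustrative rules in order and returns the result. */
    public static String tidy(String text) {
        String result = DOTS.matcher(text).replaceAll("\u2026");
        result = SPACES.matcher(result).replaceAll(" ");
        // $1 re-inserts the captured text between curly quotes.
        result = DOUBLE_QUOTED.matcher(result).replaceAll("\u201C$1\u201D");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tidy("He said  \"wait...\"  and left."));
    }
}
```

Compiling each Pattern once as a constant avoids recompiling the expression on every call, which matters when the same rules run over a lot of text.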

Posted by Jean-Philippe on December 23, 2004 1 Comments, 171 TrackBacks

March 06, 2004

Wandering in IoC lands

Someone seems to have listened to me: in Draft: Introduction to (IoC) Container Internals (LSD::RELOAD), Leo Simons introduces his new paper, an Introduction to (IoC) Container Internals. I've just read the beginning and must confess I understood nothing... Same result with PicoContainer Inversion of Control and IoC Types (there are many things to read there). After this sad news for my intellect, I took a look at the Apache corner: HiveMind Inversion of Control and Avalon IOC Patterns. Same noisy result. The IoC Introduction on Javangelist was just a bit clearer (with what seems to be a good pros and cons part)...
Well, I must be dumb, or these guys are not good teachers. My evil plan: read Inversion of Control Containers and the Dependency Injection pattern by Martin Fowler and The Dependency Inversion Principle from Object Mentor, Inc..
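For what it is worth, the core idea can be sketched in a few lines of plain Java, without any container. This is only a minimal illustration of constructor-based dependency injection, assuming hypothetical names (IocSketch, MessageSink, MemorySink, Greeter) invented for this note; it does not reproduce the API of PicoContainer, HiveMind or Avalon.

```java
import java.util.ArrayList;
import java.util.List;

public class IocSketch {
    /** The abstraction the high-level code depends on. */
    interface MessageSink {
        void send(String message);
    }

    /** One concrete implementation: it simply collects messages in memory. */
    static class MemorySink implements MessageSink {
        final List<String> messages = new ArrayList<>();

        @Override
        public void send(String message) {
            messages.add(message);
        }
    }

    /** The component receives its dependency; it never creates it itself. */
    static class Greeter {
        private final MessageSink sink;

        Greeter(MessageSink sink) {
            this.sink = sink;
        }

        void greet(String name) {
            sink.send("Hello, " + name + "!");
        }
    }

    public static void main(String[] args) {
        // Here main plays the role a container would play: it chooses the
        // implementation and wires the objects together by hand.
        MemorySink sink = new MemorySink();
        Greeter greeter = new Greeter(sink);
        greeter.greet("Jean-Philippe");
        System.out.println(sink.messages);
    }
}
```

The "inversion" is that Greeter no longer decides which MessageSink it uses; whoever assembles the application does. An IoC container automates that assembly step.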

Posted by Jean-Philippe on March 06, 2004 13 Comments, 475 TrackBacks

March 04, 2004

This is why I am hopeless at computing

It's true: I often believed what my teachers said, I was a well-behaved student, and so on. A "here is what you need to know about Java" was never going to get me through this cruel world. In the end I am useless, and if someone can point me to a good book on the IoC (Inversion of Control) pattern, I'll take it!
Next trimester I'll be following the "Introduction to programming with Java" class that is mandatory for all physics students. Oh, joy.

"In this course, students are introduced to a modern, visual development and programming environment (they dare call JBuilder modern!) based on the programming language Java. They will also learn to think algorithmically (no, it's not a word in Dutch either) by analysis of non-trivial problems (from what I've heard, add some missing pieces into a GUI cd database). Subjects include elementary object-oriented concepts such as classes, instances, methods and attributes (they mean fields here I presume)."

All lectures and assignments are mandatory. Boy, am I looking forward to biting my tongue as a CompSci student reviews my code...

(On LSD::RELOAD: Mandatory java 101...oh, c'mon!, and the follow-up here: Mandatory java 101...not really)

Posted by Jean-Philippe on March 04, 2004 9 Comments, 0 TrackBacks

August 24, 2003

I am alive!

Yes ;)

Many things to come, starting next week (-end?)...

After my month of lazy Java coding, the W3Framework (Wittycube Framework) is progressing very fast, and the first application using it, Atlas, is close to running completely (in fact what is running is what will become the first plugin set available). Atlas is an incredible application to handle “Manufactured Serendipity”.
The core parts of Atlas will serve as foundations for the W3Platform (the high level of the Wittycube Kromacube-OS), currently in the planning stage (hey! we do not build a new “Operating System” from scratch just like that ;)...).

Posted by Jean-Philippe on August 24, 2003 16 Comments, 188 TrackBacks

June 04, 2003

Using Eclipse Java Development Tools (JDT)

A very good introduction to Eclipse JDT (“java development tools subproject”) on InformIT: Using Java Development Tools in Eclipse.
In fact, a sample chapter (PDF, 52 pages) from the book The Java Developer's Guide to Eclipse, by Sherry Shavor, Jim D'Anjou, Dan Kehn, Scott Fairbrother, John Kellerman, Pat McCarthy – Paperback: 896 pages; Publisher: Addison Wesley Professional; 1st edition (May 19, 2003); ISBN: 0321159640.

(Via OSNews.com: Understanding C++ Program Structure)

Posted by Jean-Philippe on June 04, 2003 15 Comments, 4068 TrackBacks

“Powerful surprises”

On LinuxWorld.com: A first look at Ximian Desktop 2:


Next, I checked to see which games Ximian had installed for me. I discovered by clicking on More at the bottom of the list that Ximian had found and added the Devastation and Return to Castle Wolfenstein betas I had previously installed to the menu. It's good to have mission-critical apps just a click away.

Note: I did not find this story very good, because the author does not distinguish between Ximian Desktop 2 and vanilla GNOME 2. Ximian has done a lot of work to improve GNOME 2, and this article does not reflect these hacks (see Interview with Ximian's Nat Friedman on OSNews.com for a better review from one of the creators of this desktop).

(Via OSNews.com: A First Look at Ximian Desktop 2)

Posted by Jean-Philippe on June 04, 2003 17 Comments, 241 TrackBacks

June 01, 2003

A bit of history

Two links to learn:


Posted by Jean-Philippe on June 01, 2003 24 Comments, 167 TrackBacks

March 31, 2003

Java rich clients against HTML poor clients

Via Erik's Weblog (Monday, March 31, 2003 [@469]):
On "The Mountain of Worthless Information": “Effective Enterprise (Presentation): Don't forget the rich client”:

Ted Neward explains, after showing all the disadvantages of HTML for building “browser-based HTML "thin client" application”, how rich client technologies can help developers in certain web-centric applications: applets, JNLP (Java Network Launch Protocol) aka Java Web Start in J2SE, and URLClassLoader.

Posted by Jean-Philippe on March 31, 2003 24 Comments, 3075 TrackBacks
