We wrote an article for the book “Learning Communities. Das Internet als neuer Lern- und Wissensraum”, edited by Christina Schachtner and Angelika Höber and published by Campus. More information about the book can be found at Campus and Amazon.
Our contribution, “Visualising information flows in self organising knowledge networks”, can be found [here] (in German).

Last Thursday we were guests of Gerhard Dirmoser, who showed us his impressive collection of diagrams and his diagrammatic library. Gerhard is one of the leading experts in the field of diagrammatics and is devoting his work to the development of a new epistemological approach to describing and ordering diagrams. This approach is outstanding because it ultimately aims to work without textual description, relying only on diagrammatic relations. The word “description” is therefore probably inappropriate altogether, because in Gerhard’s studio you realise that his research process consists in ordering, relating and placing objects, very much like Aby Warburg’s Mnemosyne (see also the German Wikipedia entry on Warburg).
Aby Warburg revolutionised art history by introducing reproductions for didactic purposes. Nowadays image processing and graph engines can produce new experiences of exploring art. SemaSpace, a project by Gerhard Dirmoser and Dietmar Offenhuber, is about exactly this question of exploring semantically structured data and memory spaces. Dietmar Offenhuber convincingly solved the problem of handling large numbers of nodes, even several thousand, and even when the nodes are represented by images. Here’s a short description of SemaSpace by the authors:
SemaSpace is a fast and easy to use graph editor for large knowledge networks, specially designed for the application in non technical sciences and the arts. It creates interactive graph layouts in 2d and 3d by means of a flexible algorithm. The system is powerful enough for the calculation of complex networks and can incorporate additional data such as images, sounds and full texts.
On the SemaSpace Website you will find not only the tool but also an interesting application:
“25 years of ars electronica
study conducted by Gerhard Dirmoser, contains all projects / people involved in ars electronica until 2003, based on collected material and data from the ars electronica database. original files of the study:”
But SemaSpace is more than an organised database. It represents a “space of memory” that commemorates the threads of theory and media art within the “ars electronica universum.” It can be seen in the tradition of Giulio Camillo’s Memory Theatre (see also http://www.clausmoser.com/?p=378). (By the way, Camillo is a must for interaction designers.)
Dietmar is currently working on a new version of SemaSpace, and Gerhard is now about to network his collection of 4000 diagrams within the graph editor. As two thirds of the work has already been done within 20 workdays, it seems to be an appropriate way to organise large amounts of image data in a reasonable time span.
There’s a lot of other work (texts, diagrams and network graphs) by Gerhard available here: http://www.servus.at/kontext/ARS/ (strongly recommended).
Special hint for us lucky Austrians: next Sunday, February 4, a full-day lecture takes place at the Audimax of Danube University Krems.
In the past, much more effort has been spent on visualising email conversations than on visualising blog diffusion. Although it is a different domain, blog analysis can learn a lot from previous work done in the field of email archives.
In both domains there are huge archives that reflect our interests and conversations with others. Default functionality like ordering archives by time or author and search functionalities are proper means for finding information, but there’s almost no functionality for browsing archives in order to explore personal developments, the cyclical up and down of interests or the path of conversation between two persons.

In the following I refer to a paper by Judith Donath, Fernanda B. Viégas, and Scott Golder, “Visualizing Email Content: Portraying Relationships from Conversational Histories”, in which they present a prototype called “Themail”. It is based on Salton’s TFIDF algorithm, which compares the frequency of a word in a limited time span (e.g. a month) to its frequency in the whole archive. If a word has a relatively high frequency, it is displayed in a larger font in order to highlight its importance. This technique was used to analyse the email communication between the owner of the mail archive and a specific conversation partner over a series of months (see Figure 1). User testing by the authors demonstrated that this form of analysis gives good results in characterising developments in a personal relationship and helps in assigning hot topics to specific months. For example, in email conversations before a wedding, words like “invitation”, “tables”, “drinks” and guests’ names were used more frequently, and a trip to Asia resulted in words like “Bangkok”, “thai” and “kuala”.
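To make the idea of "frequency in one month versus frequency in the whole archive" more tangible, here is a minimal sketch in Python. It uses a textbook TF-IDF weighting with each month treated as one document; the exact weighting used in Themail is not described here, so the formula, the function names (`monthly_salience`, `font_size`) and the tiny sample archive are all assumptions made for illustration.

```python
import math
from collections import Counter


def monthly_salience(months):
    """Score words per month with a plain TF-IDF weighting.

    months: dict mapping a month label to the list of words written
    in that month. Each month is treated as one "document".
    Returns a dict: month label -> list of (word, score), best first.
    """
    n_months = len(months)
    # Document frequency: in how many months does each word occur?
    df = Counter()
    for words in months.values():
        df.update(set(words))

    result = {}
    for label, words in months.items():
        tf = Counter(words)
        total = len(words) or 1
        scores = {
            word: (count / total) * math.log(n_months / df[word])
            for word, count in tf.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        result[label] = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return result


def font_size(score, max_score, smallest=10, largest=36):
    """Map a score linearly to a font size (the scaling is an assumption)."""
    if max_score <= 0:
        return smallest
    return round(smallest + (largest - smallest) * score / max_score)


# Hypothetical mini archive, loosely modelled on the examples in the text.
archive = {
    "2005-05": "code bug release code deadline".split(),
    "2005-06": "wedding invitation tables drinks invitation code".split(),
    "2005-07": "bangkok thai kuala bangkok code".split(),
}
salience = monthly_salience(archive)
```

A word such as "code" that occurs in every month gets a score of zero, while "invitation" and "bangkok" come out on top for their months, which is the effect the larger font is meant to convey.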

Figure 2 shows that after the return the conversation turned back to programming and other usual topics. Users like to explore their archive together with friends and conversation partners, because it tells a lot about the evolution of a relationship (for example from classmates to lovers).
Similar analyses could be done for weblogs, either for characterising the archive as a whole or for single months or categories. The word frequencies of a weblog could be compared to overall frequencies in the blogosphere and output as word lists that describe the blog’s content. Phrases would do an even better job than single words. A system like Amazon’s SIPs (statistically improbable phrases) could probably describe a blog’s content more “objectively” than a tag cloud based on the author’s personal tagging. A similar idea has been posted by Rageboy (see also David Weinberger).
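The comparison of a blog's word frequencies with overall frequencies could be sketched roughly as follows. This is not Amazon's SIP method, which is not described here; it is a simple smoothed log-ratio of relative frequencies used as a stand-in. The names `distinctive_terms` and `bigrams`, the smoothing constant and the sample words are assumptions.

```python
import math
from collections import Counter


def distinctive_terms(blog_words, background_words, top_n=5, smoothing=1.0):
    """Rank the blog's terms by how much more frequent they are in the
    blog than in a background corpus (smoothed log-ratio)."""
    blog = Counter(blog_words)
    background = Counter(background_words)
    vocab = set(blog) | set(background)
    # Add-one style smoothing so unseen background terms do not divide by zero.
    blog_total = sum(blog.values()) + smoothing * len(vocab)
    bg_total = sum(background.values()) + smoothing * len(vocab)

    scores = {}
    for term in blog:
        p_blog = (blog[term] + smoothing) / blog_total
        p_bg = (background[term] + smoothing) / bg_total
        scores[term] = math.log(p_blog / p_bg)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


def bigrams(words):
    """Turn a word list into two-word phrases, so phrases can be ranked too."""
    return [" ".join(pair) for pair in zip(words, words[1:])]


# Hypothetical sample data.
blog_words = "network map network map diffusion the the".split()
background_words = "the the the the a a of of map".split()
ranking = distinctive_terms(blog_words, background_words)
```

Passing `bigrams(blog_words)` and `bigrams(background_words)` to the same function ranks phrases instead of single words, in the spirit of the remark that phrases describe a blog better.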

Judith Donath took the Themail concept a step further. She mapped the email conversation between six researchers over a period of 22 days. The conversation took place in preparation for Janet Abrams’s and Peter Hall’s book “else/where: mapping”. (Donath called her prototype “The Rhythm of Salience”, and it was published in the same book.) On the x-axis she distributed the names of the researchers; the y-axis mapped the flow of time. In the whole conversation only 30 messages were exchanged; they are displayed as full text. The visualisation aims to map temporal rhythms and the patterns of interaction between the participants. Each message is represented by a white square; thin white lines between the squares show the reply structure. Important words and phrases are highlighted in a manner similar to Themail.
As in the case of Themail, the usefulness of the map depends on the user’s relation to the mapped conversation. I’m quite sure that it is very useful for the people involved, but for outsiders it provides only superficial information and is only useful at a larger size and when printed out, so that the details can be read. I also doubt that the length of the messages allows for reliable content analysis like SIP in order to highlight important phrases. The map was handmade, and the highlights were chosen by personal judgement. Nevertheless it is an interesting approach to mapping information diffusion and interaction, and it is probably also useful for the blogosphere. Obviously only a small number of messages can be displayed in such a map. For a larger number of messages other forms of visualisation, such as animated maps, might do a better job.
further information:
Rhythms of Social Interaction: Messaging within a Massive Online Network
Sociologically Improbable Phrases

Blogviz, by Manuel Lima, is probably the tool that comes closest to what we want to do in our MemeMapper project. After reading Manuel Lima’s thesis and trying out blogviz itself (I spent several days on both), I gained a lot of insights and sometimes had the feeling of listening to an unknown relative who thinks in a similar way as we do. Blogviz is definitely a benchmark for our own attempts, especially in the area of visualisation and interface design.
The visual language of blogviz is very clear and appealing, but the main problem seems to be that you don’t have a first-sight experience, some kind of immediate light bulb effect, as you do for example in Marcos Weskamp’s Social Circles: in Marcos’s map you see the central persons at first glance. But maybe this is only a question of the map’s target group.
Certainly a map’s design always depends on its readers, and as Manuel focuses on people with a detailed interest in diffusion processes, such as network researchers, it is legitimate to use an interface that takes some time to learn. Marcos, on the other hand, provides a tool that targets ordinary mailing list members, especially new ones, who can almost immediately detect the important people, the hubs.
In our case we want to target both groups: ordinary weblog users who need first-glance information AND expert users with specific interests (like researchers or marketing managers). For us it is obvious that we need at least two different interface solutions. I personally don’t believe that there is one and only one solution for all purposes. By showing all steps of development and the different prototypes, Manuel’s thesis gives a perfect insight into the variety of possible solutions. By finally choosing a diagram form that he derived from a train timetable by Marey, Manuel focused on a specific “view” of the data and excluded others. But the perfect view always depends on the people using the map. (A silly example: for a car driver there’s not much sense in using a map that visualises population density.) As Manuel is interested in research, it would be very interesting to know what researchers think of blogviz. Ultimately this should include not only user testing but also the direct involvement of researchers in the development of maps and data aggregation.
Maps do not have an absolute or objective meaning in the sense that they “map” reality 1:1. Even geographical maps use different projection methods, such as the Mercator projection, which favours northern countries in size. (They simply become bigger and therefore appear more important.) So maps should pragmatically be seen as extensions of our senses and our intellectual abilities. We are free to choose one form or another as long as we make its specific view clear to users. Through usability testing and listening to our users we are able to work on a map’s usefulness as a tool for thinking and communicating thoughts. We increasingly enable people to comprehend the growing complexity of network processes. As people tend to become more and more distinct in their interests and thinking, there seems to be an increasing demand for different views of the “same” domain. But is it then really the same? In any case, the decision for one map or another is always a kind of invitation to others to share a certain view of a phenomenon and to follow a certain trail of explanation. In that sense Blogviz is definitely a strong invitation and already provides very interesting results with respect to dissemination processes.
more info about Étienne-Jules Marey:
Wikipedia about Étienne-Jules Marey. Marey became particularly famous for his movement studies, which seem to be related to the visualisation of dissemination processes. Marey was a cinema pioneer; obviously, mapping diffusion processes could benefit a lot from moving imagery or animated maps. We also used animations in our first prototype.
The relations between diffusion, movement and animation would clearly be worth a deeper investigation. Maps always catch a moment in time, whereas we are interested in what happens in between:
Here’s a tool (a “photographic gun”) Marey used to catch the “moments in between moments”:

source and large version at wikipedia
Related:
Manuel Lima’s visual complexity is a great resource for all kinds of diagrams, maps and visualisations.

I had a closer look at another classic of network visualisations: They Rule, by Josh On.
There are three factors that make this map so outstanding:
1. Data of company boards and directors - including links to all websites
2. Easy to use (some features are missing, like a “back” button or a “select and delete” function)
3. and probably the best of all: the possibility to save maps and show them to others.
This feature makes “They Rule” a community-based tool for the analysis of power and its networks. Everybody can uncover new connections between politicians and industry, save them and show them to everyone else.

Barabási, in his book “Linked”, mentions the story of an actor called Bacon who was not especially well known during his career but who became quite popular among network scientists. He’s a good example of how, in a small world (like Hollywood), less popular actors can also appear to be hubs; but Bacon’s connectivity is more an attribute of the network than an attribute of himself. There’s a nice visualisation of that story available at: http://www.netmapanalytics.com/demo.html.
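The point that short distances are a property of the network rather than of one actor can be illustrated with a breadth-first search over a co-starring graph. The following is only a toy sketch: the films and actor names are invented placeholders, and the function names are assumptions, not taken from the book.

```python
from collections import deque


def build_costar_graph(casts):
    """casts: dict mapping a film title to its list of actors.
    Two actors are linked if they appear in the same film."""
    graph = {}
    for cast in casts.values():
        for actor in cast:
            graph.setdefault(actor, set()).update(a for a in cast if a != actor)
    return graph


def distances_from(graph, source):
    """Breadth-first search: number of co-starring steps from source
    to every reachable actor (the 'Bacon number' if source is Bacon)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def average_distance(graph, source):
    """Mean distance from source to all other reachable actors."""
    dist = distances_from(graph, source)
    others = [d for actor, d in dist.items() if actor != source]
    return sum(others) / len(others) if others else 0.0


# Invented placeholder data, not real filmographies.
films = {
    "Film 1": ["A", "B", "C"],
    "Film 2": ["C", "D"],
    "Film 3": ["D", "E"],
}
costars = build_costar_graph(films)
```

Comparing `average_distance` for every actor shows that several of them sit only a few steps from everybody else. In a small world many nodes look like a centre, which is why the low numbers say more about the network than about any one actor.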
Implicit Structure and the Dynamics of Blogspace was written by Eytan Adar, Lada Adamic, Li Zhang, and Rajan Lukose from the HP Information Dynamics Lab.
It’s quite an early paper (2004), and it seems as if its authors started at more or less the same time (spring 2003) as we did with the first Blogosphere Map prototype. Whereas we were focusing on the aesthetics of diffusion mapping, the IDL clearly focused on its analytics. The work is quite impressive, as it poses the relevant questions for the first time:
How can we analyse infection paths when there’s no explicit information about how news (represented by a URL) travelled through the blogosphere (because there are only a few “via” links)? How can we infer infection routes? How can we measure similarity between blogs in order to infer infection routes?
The authors not only posed the right questions but also gave competent answers by formulating measures such as blog_similarity and iRank. The paper opens up a wide field of further research, for example on the relative weight of link_similarity between weblogs versus text_similarity versus infection timing when inferring infection routes. Other methods can probably be found as well.
In any case the paper showed that there are methods to map the general collaborative structure of the blogosphere by identifying general (i.e. more likely) trails of infection, and that it is possible to infer infection routes by embedding explicit links in those general trails of infection.
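To give a rough feeling for how link similarity and infection timing might be combined, here is a small sketch. It is not the method of the paper: it simply uses Jaccard overlap of outgoing links as a stand-in similarity measure and considers only blogs that mentioned the URL earlier as candidate sources. The function names, the data layout and the sample blogs are assumptions.

```python
def link_similarity(links_a, links_b):
    """Jaccard overlap between the sets of URLs two blogs link to."""
    a, b = set(links_a), set(links_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_sources(target, citations, outlinks):
    """Rank possible infection sources for `target`.

    citations: blog -> day on which the blog first mentioned the URL
    outlinks:  blog -> set of URLs the blog links to in general
    Only blogs that mentioned the URL before the target are candidates;
    they are ranked by link similarity to the target, best first.
    """
    target_day = citations[target]
    candidates = [blog for blog, day in citations.items() if day < target_day]
    scored = [
        (blog, link_similarity(outlinks[target], outlinks[blog]))
        for blog in candidates
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data.
outlinks = {
    "blog_a": {"x", "y", "z"},
    "blog_b": {"x", "y", "w"},
    "blog_c": {"q"},
    "blog_d": {"x", "y", "z", "w"},
}
citations = {"blog_a": 3, "blog_b": 1, "blog_c": 2, "blog_d": 5}
sources_for_a = likely_sources("blog_a", citations, outlinks)
```

In this toy data blog_b is the most plausible source for blog_a, because it mentioned the URL earlier and shares half of its links, whereas blog_d is excluded because it posted later. Weighting text similarity against link similarity and timing, as raised above, would need a further scoring term.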
related:
K-means clustering
Wikipedia: Clusteranalyse/k-means
K-means-demo explains the method quite clearly.
Kruskal-Wallis Test
TFIDF Scheme (German),
Support Vector Machine (SVM), try out
better introduction than Wikipedia
LIBSVM — A Library for Support Vector Machines (used for this paper) Introduction for SVM-beginners by the creators of LIBSVM
Graphviz was used to generate graphs.
Zoomgraph

Vizster, developed by Jeffrey Heer and Danah Boyd, is certainly one of the best designed tools for visualising online social networks. By mapping the web of Friendster relationships, Vizster proves that a visualisation can be both easy to use AND powerful at the same time. Its simplicity mainly derives from the fact that its users, mainly Friendster users, immediately understand its purpose: looking at something very familiar, namely their own relations, from a new perspective, a bird’s-eye perspective. Simply changing the perspective can be seen as a powerful means to initiate new cognitive processes and to amplify thinking.
I think a good map needs a shift of perspective that on the one hand is rather radical (from worm’s eye to bird’s eye) but on the other hand is immediately readable.
The sensation of discovering “something new” is not an attribute of the map: it is triggered by the map but takes place in the user’s mind. A good map like Vizster seems to bridge two previously unlinked nodes in our cognitive system, with the result that we perceive something we didn’t perceive before. Maps are, in a way, “new and attractive perceptions on a silver platter”. (Wow, I didn’t know that…)
In Vizster’s case the map bridges users’ everyday experience of having Friendster relations with their everyday ability to read maps. The egocentric view makes it even easier to understand the map’s visual grammar. A user discovers himself or herself in the middle of a web consisting of his or her friends. He or she intuitively understands that the map is about him/her and his/her Friendster relations.
This immediate success is very important, because it invites the user to start a playful exploration of other aspects and features. Furthermore it provokes the formulation of “ad hoc interests”: Why is there a cluster? What do they have in common? Ah, they’re all students! Of the same University? Yes, indeed!
A good map like Vizster provides exactly the kind of functionality that is needed to follow your ad hoc interests. The intuitive learning of new functionality is therefore guided by ad hoc interests and derived from previous experiences (within Vizster and from interacting with similar maps). Vizster helps you to find your own way.
Designers often make the mistake of wanting to invent something “new”, like “a revolutionary view of data”. But the revolution takes place in the user’s mind, and if the usage of a “new” map is not based on everyday experiences, it will fail. Good design picks users up where they are and invites them to explore something new. If the first step (understanding what the design is about, where to start…) is too big, there is a high risk that you lose a lot of users.
Once a user has successfully started using a map and has already made first rediscoveries, there’s a willingness to learn new functionality, like Vizster’s linkage view.
Related:
UCINet, “A comprehensive package for the analysis of social network data as well as other 1-mode and 2-mode data”,
JUNG, Java Universal Network/Graph Framework,
GUESS, an exploratory data analysis and visualization tool for graphs and networks,
ContactMap, Organizing Communication in a Social Desktop,
The TouchGraph LiveJournal Browser allows one to visualize and explore social networks,
Barnes-Hut algorithm,
Fast algorithm for detecting community structure in networks, by Newman, M.E.J.,
Prefuse, A Java-based toolkit for building interactive information visualization applications. Vizster is driven by prefuse.
During summer holidays in Spain I found time to read Barabási’s network bible “Linked”.
I focus on personal remarks, as there is a good book review available.
Further information about the book here.
As an adherent of self-organisation theory I welcome most of the findings presented in the book. The “new” network theory seems to provide a general toolbox for looking at a variety of systems: technical networks such as the internet as well as the nervous system or social relations.
This was already a promise of cybernetics and later, in the 80s and 90s, of various self-organisation theories. I myself tried very hard to apply self-organisation theories in the field of media theory (see thesis), but looking at it now in the light of network theory I have to admit that I got stuck on a descriptive level. I often needed to resort to analogies simply because the appropriate analysis tools were not available at that time. Although analogies are very important for learning and understanding new knowledge domains, they are problematic at a scientific level, especially when you try to explain one domain with the vocabulary of another. This is why Humberto Maturana, who coined the term “autopoiesis” in the field of neurobiology, was not very happy about the German sociologist Niklas Luhmann, who wrote a phalanx of thick books about the “autopoiesis” of social systems. Maturana objected that this was not an adequate application of his theory.
The main reason for the emergence of new network theories lies in the fact that the information age produces a flood of data. E-mail archives, newsgroups and the web provide a huge database that stores human communication. Until the emergence of the internet, human communication was very ephemeral. In order to study communication or social systems you either had to refer to rather poor written sources like books or letters, or you had to design tests, questionnaires, or other kinds of artificial research environments. Now the data is out there, and you simply need to harvest it and verify your research hypothesis.
Time will tell which kinds of research questions can be answered by data-based network analysis. My guess is that its unique role lies in its ability to tell us interesting things about systems not only at an intellectual level but also in a form that appeals to our senses. Network analysis also implies a new form of scientific aesthetics that might pave the way for new forms of holistic understanding, which we urgently need to cope with the challenges of the 21st century like global warming, poverty, “terrorism” and so on. It will finally result in new forms of maps that might extend our comprehension of complex processes and our intellectual capabilities to interact with them.
In our MemeMapper project we will try to make some, hopefully bigger, steps in that direction. Therefore we appreciate requests from network researchers in order to harvest

Social Circles, created by Marcos Weskamp, is a good example of how a good graph should be: relatively simple and visually appealing. (“Appeal” seems to be important in order to make people concentrate on the content and message of a map.) It is simple because there are only a few cartographic signs:
nodes (circular discs): representing members of a certain mailing list.
size of nodes: representing the frequency of posts by that member (Marcos Weskamp calls it “Chatter Level”)
position of nodes: representing how often that member got a reply (Marcos Weskamp calls it “Social Visibility”)
links between nodes: representing communication in reply to a thread that has been started by a certain person.
All nodes together form a circular cloud.
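The signs listed above could be derived from a mailing list archive roughly as in the following sketch. How Social Circles actually computes and scales these values is not described here, so the linear mapping from replies to distance from the centre, the function names and the sample messages are all assumptions.

```python
from collections import Counter


def mailing_list_metrics(messages):
    """messages: list of (author, replied_to) pairs, where replied_to is
    the author being answered, or None for a message that starts a thread.

    Returns (posts, replies_received, links):
      posts            -> basis for node size ("Chatter Level")
      replies_received -> basis for node position ("Social Visibility")
      links            -> undirected pairs of members who replied to each other
    """
    posts = Counter()
    replies_received = Counter()
    links = set()
    for author, replied_to in messages:
        posts[author] += 1
        if replied_to is not None and replied_to != author:
            replies_received[replied_to] += 1
            links.add(frozenset((author, replied_to)))
    return posts, replies_received, links


def radial_distance(replies, max_replies, outer_radius=100.0):
    """Place members with many replies near the centre of the circular
    cloud (assumed linear scaling; 0.0 is the centre)."""
    if max_replies == 0:
        return outer_radius
    return outer_radius * (1 - replies / max_replies)


# Hypothetical sample thread data.
messages = [
    ("ana", None),
    ("ben", "ana"),
    ("cem", "ana"),
    ("ana", "ben"),
    ("ben", None),
]
posts, replies_received, links = mailing_list_metrics(messages)
most_replies = max(replies_received.values(), default=0)
```

With this data "ana" receives the most replies and therefore ends up in the centre, while "cem", who never got a reply, sits on the outer rim.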
Even at first glance, without knowing the exact meaning of the signs, one can assume that the most important nodes are the biggest ones and those positioned in the centre. An intuitive and quick perception is possible and invites a deeper exploration that reveals more and more detailed information.
More information can be found on the Social Circles start page, in the section “how it works”.