JS Library For Image & Vector-Tiled Maps

Polymaps is a free JavaScript library for making dynamic, interactive maps in modern web browsers.

Polymaps provides speedy display of multi-zoom datasets over maps, and supports a variety of visual presentations for tiled vector data, in addition to the usual cartography from OpenStreetMap, CloudMade, Bing, and other providers of image-based web maps.

“Because Polymaps can load data at a full range of scales, itโ€™s ideal for showing information from country level on down to states, cities, neighborhoods, and individual streets.

Because Polymaps uses SVG (Scalable Vector Graphics) to display information, you can use familiar, comfortable CSS rules to define the design of your data.

And because Polymaps uses the well known spherical mercator tile format for its imagery and its data, publishing information is a snap.”

References
PolyMaps: http://polymaps.org/
QuadTiles: http://wiki.openstreetmap.org/wiki/QuadTiles

Data Visualization: The Quest for Patterns

Recently I was trying to present a way for individuals to find patterns in large amounts of data. The end result was considered successful and I thought I would share at a high level what was done.

Firstly, the data looked a tad liked this…

Note 1: this is NOT the actual data.
Note 2: C1 and C2 are where I wish to draw interest to in this example. I.e. the final outcome should be a clean visual of the all data with an intuitive focus where C1 and C2 get and send their data.

Certainly not terribly daunting, interesting, nor unfamiliar for the average office/information worker/victim ๐Ÿ™‚
Running this through Excel Services really did not produce anything useful. It hinted at associations between the data points but that was it. See for yourself…

Not the best presentation of data I’m sure you’d agree… (Even forgiving the quality of the image.)
So I ran the data around a different set of axis with a bit of custom code and got a dramatically clearer view of the columns and their content associations. At least I think it is.

Colour coding C1 and C2 made things even more clear.

A successful solution!

The value of distributed computing: The return of Markov chain Monte Carlo methods

A while back I wrote something on doing Monte Carlo simulations with Web Services and SharePoint. Halfway through I mentioned that Google Pagerank was defined by a Markov chain which in turn was an output of a process called Markov chain Monte Carlo methods. Not that it concerned me but only one person mentioned this, and at that it was a vague mentioning. Huh…

This actually is a big deal. In fact a very big deal. A multi billion dollar deal in fact, as in the case of Google PageRank. Distributed computing has the power to help us solve many things if applied correctly. The “cloud” does not. (A topic for later.) Probably the greatest hurdle in getting people back on track is that this technology has use beyond the scope of most peoples daily lives. For example…

A paper was published in PLoS last week, September 4th 2009, called “Can an Eigenvector Measure Species’ Importance for Coextinctions?” In it the authors state that “PageRank” can be applied to the study of food webs. Food webs are the complex networks of who eats whom in an ecosystem.Typically we’re at the top, unless Hollywood or very bad planning is involved. Essentially, the scientists are saying that their particular version of PageRank could be a simple way of working out which extinctions would lead to ecosystem collapse. A relatively handy thing to have these days… As every species is embedded in a complex network of relationships with others, even a single extinction can rapidly cascade into the loss of seemingly unrelated species. Investigating when this might happen using more conventional methods is complicated as even in simple ecosystems, the number of combinations exceeds the number of atoms in the universe… E.g. a typical lottery which has 8 numbers that can range between 1 and 50 has 39,062,500,000,000 different combinations…

The researchers had to tweak PageRank to it to adapt it for their ecology focused purposes.

“First of all we had to reverse the definition of the algorithm.” “In PageRank, a web page is important if important pages point to it. In our approach a species is important if it points to important species.”

They also tested against algorithms that were already in use in computational biology to find a solution to the same problem. PageRank, in its adjusted form, gave them exactly the same solution as these much more complicated algorithms.

With the right design SharePoint can be an extremely useful, and totally appropriate, interface for accessing and disseminating the inputs and outputs of such an effort. It can store and present this data with all of the requisite benefits one would expect from a collaborative platform. Certainly there’s a world of work involved in doing something like this but the key point is that the right tool for the right job mantra works here. “All” you need is:

  • IIS
  • .NET
  • SharePoint
  • PowerShell
  • Visual Studio
  • SQL
  • Skill

PerformancePoint: BI for the masses.

Microsoft wants to bring BI to the masses and based on customer feedback they decided to put it together with the next version of SharePoint. They have decided to consolidate the scorecard, dashboard and analytical capabilities from Office PerformancePoint Server into Microsoft Office SharePoint Server Enterprise and rebrand them as PerformancePoint Services for SharePoint. Overall an equally good and bad thing in my opinion.

  • Good, because it will allow people to use BI with their SharePoint environments.
  • Bad, because it will allow people to use BI with their SharePoint environments

More information here:
http://blogs.msdn.com/performancepoint/
http://office.microsoft.com/en-us/performancepoint/FX101680481033.aspx