Information World Review (IWR) Blog - A blog from www.iwr.co.uk

Some things are hard to find

By Phil Muncaster.

Autonomy has just announced a new e-discovery solution to increase its presence in this burgeoning space. When it bought archiving and e-discovery vendor Zantaz last year, the firm clearly signalled its intent to expand into areas related to its core competency and heritage of enterprise search. And while it's still best known for the latter - and while it continues to make oodles of cash providing big name clients like the BBC, Boeing and Coca-Cola with search technology - the e-discovery space represents a massive opportunity, as firms look to overcome the challenges presented by an increasing raft of legislation and industry regulations.

In the US, of course, e-discovery has been driven mainly by the recently updated FRCP - Federal Rules of Civil Procedure - which lay down aggressive new rules for the discovery and presentation of electronic records as evidence in US courts. E-discovery, archiving, retention; they're all bound up in this area and with strict penalties for the destruction of evidence also part of the new FRCP, the stakes have been raised significantly for firms. Not that this is just a US problem either - just as SOX was felt in other countries, so the FRCP could have an impact elsewhere, including this side of the Atlantic.

This new hosted solution features technology to cut the time it takes your legal bods to review electronically stored information and classify it according to its status, and also to make an early assessment of the related case. As you'd expect from Autonomy, which I guess prides itself on being able to scale in the enterprise search space as far as, and beyond, what any organisation needs, the technology can process terabytes of electronically stored info without blinking - in over 100 languages and 1000 data types. The filtering of information in such massive data sets can make it easier to gain visibility into that information, says Autonomy.

It remains to be seen whether this being a hosted solution causes any hesitation among enterprise buyers - after all, it's meant to dig out the most sensitive of sensitive documents; will firms prefer to keep this sort of capability in-house? In its defence on the security front though, Autonomy maintains that because all elements of the solution are maintained by a single vendor, this reduces the risk of data becoming lost or corrupted, and makes the whole process more auditable. Let's see what happens; e-discovery is certainly here to stay though, and you can probably expect more big name vendors on the content management scene trying to get in on the action with "holistic, end-to-end solutions".

Fast deal muddies Microsoft search strategy

A lot of people see Microsoft’s agreement to buy Fast as just another example of how mergers and acquisitions are leading to inevitable consolidation in enterprise software generally. I’m not so sure it’s as cut and dried as that.

Most M&A is done to fill a gap in functionality or to grab market share. The Fast deal does both but it would also appear to cut right across Microsoft’s strategy of late last year when it described a plan to develop its search capabilities by organic means with a product called Search Server 2008.

Microsoft now says it plans to integrate Fast with Search Server and SharePoint but, having just cost Redmond $1.2bn, the Fast technology is a racing certainty to take precedence.

It’s a slightly odd state of affairs, as it’s only a few months since Microsoft was describing how Search Server would soon be able to compete at the top end of enterprise search but, like Newcastle United parting company with coach Sam Allardyce, one can only assume that Microsoft saw the light at a rather odd juncture.

One report suggests Microsoft might also have taken a close look at Endeca and Autonomy before deciding Fast was the pick of the bunch available. Autonomy, at perhaps twice the price of Fast, would have been pricey given Microsoft’s relatively Scrooge-like attitude to acquisitions, but both of these companies will now be under more scrutiny than ever as sale candidates. It will come as no surprise that the most likely buyers are IBM, Oracle and Google.

Incidentally, Fast, like Autonomy, has R&D in Cambridge. That’s Cambridge as in the great university, punting on the Cam and so on, not Cambridge, Massachusetts or some other Cambridge. In enterprise search, at least, there is a part of the tech world that remains forever England.

You pay for what you get

Civil servants are reeling in the wake of the horrific news that CDs containing the records of Her Majesty's Revenue and Customs (HMRC) database have been lost, and the further news of DVLA data being lost. The full cost to taxpaying members of the public may not be fully realised for years to come.

This debacle is not only an example of incredibly poor information management, but also a sign of a wider problem in the UK, that you get what you pay for. Or in this case you don't get what you pay for. 

Information management is, or rather was, at the heart of British life. Travel to former colonies like India or Australia and they'll gladly inform you of the regimented behaviour towards information that led to government structures that have served the sub-continent and prison colony well to date. Yet, those standards have dropped.

As we debated the issue, an IWR reporter asked: how come information of this value was so easy to simply download and burn to a CD? Technology preventing such blunders is not new and is a basic function of many information management systems.

Revelations of the missing information came a day after a report on the BBC's Today programme that employees of the Driving Standards Agency and vehicle licensing body the DVLA take on average three weeks' sick leave a year. Missing information and low staff morale are examples of a civil service that is poorly funded and poorly managed.

It is too easy to wag the finger of blame at civil servants, when in truth a much wider debate needs to take place. As taxpayers and child benefit recipients we are angry and worried; as information professionals we are dumbfounded that such lapses could have occurred. What of our role as citizens? Since the 1980s we've wanted a John Lewis service, but only paid Tesco value brand prices. If you want John Lewis quality, you pay John Lewis prices. On the high street this modus operandi fits well with the public, as they choose when they want quality and when they want to increase their spending. So why is it that we expect our state services to manage high level information on a low level budget?

This needs to be a debate about our society and its values, literally, as well as an improvement in information management.

Information professionals guiding you to the best bits of the blogosphere

Ben Toth reveals how he keeps his information intake healthy and why blogging can be more valuable than social networks such as Facebook.

Q Who are you?
A Ben Toth, 48, domiciled on a farm in Herefordshire. I trained as a librarian at University College London about 15 years ago. I used to be the director of the NHS National Knowledge Service when it was part of Connecting for Health. The best known service it runs is the National Library for Health (www.library.nhs.uk). Currently, I’m designing the enterprise architecture for the National Institute for Health Research (www.nihr.ac.uk). I’m also writing a book on Health 2.0, which will be published in parts later this year.

Q Where is your blog?
A You’ll find it at http://nelh.blogspot.com

Q Describe your blog and the categories on it
A It’s just a public notebook really. Its content tends to reflect what I’m working on, but it’s mostly about libraries, health and the web. I could use Microsoft Word to keep my notes. I could use del.icio.us. But a blog is more visible and more in the flow of the things I’m reading, which are almost invariably on the web. A lot of the entries I make are just notings – highlight, right-click and send to Blogger. I use tags but I’m not very strict about categorising things.

Q How long have you been blogging?
A Since about 2001. Eighteen months ago I lost all my entries and had to start again.

Q What started you blogging?
A I was helping my daughter set up a website as part of a Brownie project she was doing. I couldn’t use the National Electronic Library for Health servers and I didn’t want to manage Apache or pay someone to, so we used Tripod. Which worked, but it was difficult to use. And then I read about Evan Williams’ little project, which became Blogger, had a go with it, and haven’t looked back. It’s become a habit, and I haven’t got tired of it yet.

Q What bloggers do you watch and link to, and why?
A These days I follow things through RSS if I can, so my blog-watching is mostly via a feed reader. The only blog I regularly visit is Dave Winer’s (www.scripting.com) because he’s taken blog writing to a level where the argument is developed through the day and so needs to be read on the page. I look at Techmeme (www.techmeme.com), but that’s not really a blog. I used to maintain a list of blogs that I linked to through blogrolling, but I can’t see the point of doing that any more. The social web takes care of that sort of affiliation-showing much better.

Q Do you comment on other blogs?
A I don’t comment much. Sometimes I carp from the sidelines on e-healthinsider (www.e-health-insider.com), but I don’t think there’s much value in commenting or reading comments. That’s not to say that discussion isn’t valuable, but I’d rather read views as blog entries than as comments on someone else’s blog.

Q How does your organisation benefit from your blog presence?
A It’s the best way of keeping in touch with what’s going on, and keeping a blog maintains some visibility to people.

Q How does blogging benefit your career?
A Blogging and RSS are really important for me professionally. They keep me up to date in a way that nothing else can.

Q What good things have happened to you solely because you blog?
A Making professional contacts that I otherwise wouldn’t have and maintaining ones that might otherwise have fallen off. In some ways blogging is more useful than LinkedIn and Facebook as a social networking tool. But it’s really only a matter of time until traditional blogging gets divided up between Facebook, Vlogging and Twitter.

Q Setting work aside, which blogs do you read just for fun?
A The Fake Steve Jobs blog was great (http://fakesteve.blogspot.com). And when I need a chuckle, I check out the Dilbert RSS feed (http://dwlt.net/tapestry/dilbert.rdf).

Q What are the blogs in your sector that you trust?
A The reliably interesting starting points on library matters for me are:
www.earlham.edu/~peters/fos/fosblog.html
http://orweblog.oclc.org
www.philbradley.typepad.com
http://tomroper.typepad.com
And Jon Udell is a first-class technologist who happens to like libraries (http://blog.jonudell.net)

IWR Information Professional of the Year Award

The IWR American Psychological Association Information Professional of the Year award has been announced and went, deservedly, to Brian Kelly, UK Web Focus for the UKOLN organisation.

The award is judged by a panel of previous winners and the IWR editorial team. As editor of IWR, when I judge the award I look for an individual who is pushing the limits of information and technology, taking the role of the information professional as far as possible and making it an exciting one. When looking through the final results I could see that the other judges felt the same way and Brian was an excellent choice.

Brian's role is that of national web co-ordinator, an advisory post funded by the educational body JISC and the Museums, Library and Archives Council (MLA).

In this role Brian is looking at the web as a central resource for learning and research in higher education, and at ways to make it a successful one. That is a challenging task, because the web is still very young and is constantly changing, as the recent changes dubbed Web 2.0 show. Brian is going to be pretty busy for some time to come.

Brian is based at the University of Bath, and I know from information professionals I have dealt with in the academic sector that he is very well respected and his thoughts are often the basis for great debate within the industry. Linked to this is his blog, which is one of the most popular in the sector.

I hope all IWR readers will join me in congratulating Brian on a very well deserved award.

Jimmy Wales on the role of Wikipedia in society

Jimmy Wales, chairman of Wikipedia, gave the keynote speech at Online Information 2007, with a presentation titled "Web 2.0 in action: Free culture & community on the move".

Starts with Britannica editor Charles Van Doren, who said in 1962 that the encyclopaedia should be radical, but Wales claims they have been anything but.

Small showing of hands for those that have edited, although Wales believes it’s a good showing, "but not as many as college kids".

"I consider us to be the Red Cross of information," he says as he describes its charitable status. It has 10 full-time staff and will spend about $2m to $3m this year, which is tiny compared to the major publishers. The vast majority of the money is from small donations, which he likes because it's grass roots and not dependent on advertisers.

Wales talks about the desire to extend the languages that are in use on Wikipedia, including Hindi and Afrikaans.

Wiki is free in the sense of GNU: it's free to copy, modify and distribute.

Shows a video of his travels to India and how he learnt that the local communities want to use the English version, as the English language is a route out of poverty. His organisation has been out to South Africa teaching students how to edit Wikipedia. "One of the things we have learnt is that if you can get five to 10 editors working together, it can make a great difference." These groups make progress and then they look towards outreach and who they can include. Hence the organisation has set up an academy to find the founding editors. It has begun in India, with 10,000 to 20,000 articles a month being put together by academy organisations.

Wikia is his next subject, a separate organisation working in 66 languages, plus a 67th, Klingon. Wales goes on to demonstrate, using Google search results for Muppets, how the top result is the official site but the rest of the results are from web-based conversation, ie Wikipedia pages, forums and fan sites. He shows an article on the Ford motor company and how the Muppet Wiki site has an article on Muppet Ford ads, arguing that this level of information would never have been available before.

The search engine is a political statement, in a small-p sense, Wales says. The proprietary software of the main players is a mystery, in that people have no control over it and it is not accountable. The Wikia search will publish its algorithm.

Wales believes that the trust of social networks and setting up trusted networks can be utilised in search.

On the role of collaboration, he asks the audience to imagine that they are designing a restaurant, discussing the idea that we trust the people around us: we don't put people in cages in restaurants because they will be using knives.
The wiki philosophy is to allow people to do good.

Exploitation 2.0

I got a smashing email the other day from a fellow Flickr user. Apparently, they'd shortlisted a picture of mine. How exciting.

Well, turns out, not that exciting. The Schmap shortlist was for a so-so picture taken in Brighton to be published in their online guide to Brighton. So far, so good. Unfortunately, I wouldn't be paid, and Schmap, and I'm presuming its owners, get perpetual worldwide rights to the image. Free. If, like me, you love free stuff, that's great news. Except when you're giving stuff - free - to a company that will make money from it. Because although Schmaps are free at the point of consumption, the company makes money by selling advertising off the back of them.

Of course, to look at the web, this is great; Schmap has clearly got its messaging spot on, and there are tons of Flickr users who think that being published - albeit without being paid for their work - is about as exciting as it gets. Some professional photographers are particularly excited, of course.

SaaS might not fit enterprise search

The rise and rise of software as a service has been such a mantra in the IT media over the last few years that it comes as something of a shock to see the SaaS model actually on the wane in enterprise search. Nevertheless, a recent report by CMS Watch says it plainly: “[the SaaS] model has been a hot topic recently [but] the SaaS model for enterprise search is on the decline”. So, what’s going on here?

CMS Watch itself lists three possible reasons: the preponderance of web-only search in SaaS offerings; the popularity and ease-of-use afforded by appliances; and the competition-squishing presence of Google in the sector.

Let me suggest two more: the fact that free is a compelling price, and the notion that SaaS might not be all things to all men.

Companies looking for a search service today will inevitably be attracted to freebie tasters, especially when the companies offering them - Microsoft and IBM - are as big as they come. As discussed earlier, these are highly attractive inducements that offer familiar environments to try out, and a solid upgrade path for those who want to carry on afterwards.

Second, it’s time to admit that SaaS has no Midas effect, except perhaps on marketers. The on-demand model has had a revolutionary effect on customer relationship management and sales force automation, and it is changing the way human resources operates, for example in measuring employee performance. But there are many, many other areas where it has had little or no effect. Even in the much-hyped area of productivity applications where Google and various startups have generated scads of coverage, there has been close to no impact on the hegemony of Microsoft Office, for example.

SaaS is a hugely important trend but privacy concerns, the need to delve into far-flung corners of the enterprise and ancient applications, and sundry other factors mean that search is unlikely to be a happy hunting ground for the model in the immediate future at least.

Information professionals guiding you to the best bits of the blogosphere - Lorcan Dempsey

Lorcan Dempsey has worked for JISC and libraries on both sides of the Irish Sea and the Atlantic. As a member of the National Information Standards Organisation, his blog on networked information and digital libraries is well followed.

Q Who are you?
A I work in Dublin, Ohio, was born in Dublin, Ireland, and spent a long time in between in the UK. I am lucky to have what I believe to be one of the most interesting jobs in the library world. I am responsible for the programmes and research area within OCLC (Online Computer Library Center). I also help shape OCLC strategic direction.

Q Where can we find your blog?
A http://orweblog.oclc.org

Q Describe your blog
A I say that it is about “libraries, networks and services”. I suppose that over time it has become more general. At first it had more of a technical slant; now it ranges more widely. I tend to talk about how networks are reconfiguring library services and I have some recurrent threads. These include:

Making data work harder.
We invest a lot in bibliographic data and need to use it more imaginatively in our systems and services.
Moving to the network level.
No single website is the sole focus of a user’s attention. The network is the focus of attention. And a major part of our network use revolves around significant network-level services - Amazon, Google, IMDB, and so on. These match supply and demand in efficient ways. The real message of Web 2.0 is the emergence of this pattern of service: data hubs with strong gravitational pull generated through network effects.
Being in the flow.
The focus of attention has shifted from website to workflow. The network is not so much about finding things as getting things done, and we have increasingly rich support for “networkflow”. We may construct our personal digital identities around services in the browser or on the network (RSS aggregators, social networking sites, bookmarks, etc), and we use prefabricated workflows (course management system, customer relationship management system, and so on).

Q How long have you been blogging?
A Almost four years.

Q What started you blogging?
A After I arrived in OCLC I tended to send out a lot of emails. A colleague suggested that a blog might be a better model.

Q Do you comment on other blogs and what is the value of it?
A The comments on some blogs seem more important than on others.

Q What are the blogs in your sector that you trust?
A I keep a wide range of feeds in my aggregator and will focus on different ones from time to time. Again, I tend to be more interested in “voice” or those from whom I can learn something. From a library point of view, I look at Caveat Lector (http://cavlec.yarinareth.net) and ACRLog (www.acrlblog.org).

Alma Swan’s new blog, OptimalScholarship (http://optimalscholarship.blogspot.com) and eFoundations (http://efoundations.typepad.com) from Andy Powell and Pete Johnston, are informative and provocative. I find PlanetCode4Lib (http://planet.code4lib.org) an efficient and useful way of keeping up with a range of stuff.

Q What good things have happened to you that could only have happened because of your blogging?
A I have always contributed to the professional literature. But I find that blogging is quite liberating: it is much easier to write blog entries than longer pieces. It has made me write more quickly and to think about short communications.

Q Which blogs do you read just for fun?
A I look at John Naughton’s Memex 1.1 (http://memex.naughtons.org) and William Gibson’s blog (www.williamgibsonbooks.com/blog/blog.asp), and the pictures in YarnStorm (http://yarnstorm.blogs.com) make me smile.

Cause and effect

This post comes courtesy of Metafilter. I could bash on about how, if I'm looking for something interesting or controversial when aimlessly browsing, I go to Metafilter and not a search engine. I could draw a comparison between the effectiveness of Metafilter as a search engine for really cool stuff and the primacy of a certain search engine. Or how Metafilter does the job right first time most of the time, while the likes of BoingBoing et al show merely occasional flashes of brilliance when compared to the massively parallel user model of Mefi. But I won't, because they're all a bit tenuous, to be honest.
Instead, I'd like to point you toward a posting on Metafilter; if Google were optimised for Google. Click through the page, and it's possible to see how search engines have changed the physical appearance of the web. We're all aware to a certain extent of how external influences change the design and layout of sites, but I was stunned to see the sheer volume of cruft, crap and extra verbiage added to the page in the name of SEO.

Bloggers-in-chief

Daniel Griffin, IWR Deputy Editor
Daniel joined IWR in 2006 after a career as a publisher of guides, supplements and websites for magazine and event companies. His special interest is the evolving publishing and information industry online.

Peter Williams, IWR Editor
Peter is in his second spell on IWR. Over the last few years he has developed interest in the fields of knowledge management and e-learning, writing and editing extensively on both topics.

Friends of IWR

LI Issues
James Mullan

Lorcan Dempsey’s weblog
Lorcan Dempsey

SocialTech
Josie Fraser

Jennie Law’s blog
Jennie Law

UK Web Focus
Brian Kelly

tfpl blog
James Lappin

e4innovation
Grainne Conole


© Incisive Media Ltd. 2008
Incisive Media Limited, Haymarket House, 28-29 Haymarket, London SW1Y 4RX, is a company registered in the United Kingdom with company registration number 04038503