Computer-assisted curation: let's figure out the best system to help scale up curation operations!

Lately, knowledge management and content curation seem to be converging, as providers of private social networks for companies search for the holy grail of social collaboration.

At the same time, a great wave of discussion and innovation is coming from the world of journalism, where the community is trying to solve the scalability problem of the current human-based curation model.

A great tweet posted last week by David Clinch illustrates the split between two approaches: either completely human-based curation and knowledge management, or algorithmic recommendations. His tweet stated the obvious: we need a system that leverages both!

As a lot of commentators willingly admit, human curation takes a lot of time, even when assisted by tools that help discover information, and most likely does not scale. On the other hand, human curation is absolutely needed, especially in a business context, where real people are best positioned to share crucial information with their teammates.

I think that while algorithmic curation can clearly help humans curate with proper quality, it's not easy to determine how far the algorithms should be involved, and especially in what ways.

Some options are already available today. The most widely used so far are filtering tools that search within your existing networks or sources to find interesting content.

This may work for a curator who already follows a lot of sources and Twitter accounts, but even then this approach seems to confine the user to their own network and does not necessarily allow proper discovery.

As for recommendation systems, they tend to run in circles, searching for information very similar to what was already "saved" or "liked", when the real reason it was saved or liked may have been the quality of the text or of the source…

As for pure aggregators, they don't really save time, because filtering what an aggregator delivers is a gigantic task.

Here are some areas where we think human, team-based collaborative curation and knowledge management could benefit from algorithmic curation.

1) Discovery of new sources and networks: keeping up with the information, and especially making sure they are monitoring the proper channels, is one of the hardest parts of a curator's job. Once taught what kind of sources and users a curator is looking for, a machine could sift through the incredible mass of sources and people out there to identify those likely to be trusted sources of information.

A machine could do this by combining signals such as text analysis, social reach, semantic density, popularity and more.

2) Learning the profile of a curator: A lot of engines focus on the semantic meaning of an article in order to recommend other content. But with advanced NLP techniques and text extraction methods, we could go further and capture the tone, the length and other signals that indicate a human curator's preferences, beyond the keywords used in the text.

3) Social recommendations: It's interesting to explore the idea of user-to-user recommendations in the context of a knowledge management / curation operation. By detecting users who seem to click, like, share or save the same articles, we can connect them so they pool their search and discovery efforts and speed things up.
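
To make the third idea a bit more concrete, here is a minimal sketch of how users who interact with the same articles could be detected. It is only an illustration, not a description of any existing system: the Jaccard overlap measure, the 0.5 threshold, the function names and the sample data are all assumptions chosen for the example.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of articles two users have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_users(interactions: dict, threshold: float = 0.5) -> list:
    """Return (user, user, score) triples whose overlap meets the threshold.

    `interactions` maps a user name to the set of article ids that user
    clicked, liked, shared or saved. The threshold is an arbitrary,
    hypothetical cut-off. Results are ordered from highest score to lowest.
    """
    pairs = []
    for u1, u2 in combinations(sorted(interactions), 2):
        score = jaccard(interactions[u1], interactions[u2])
        if score >= threshold:
            pairs.append((u1, u2, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Made-up sample data: alice and bob share two of four distinct articles.
interactions = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a2", "a3", "a4"},
    "carol": {"a9"},
}
matches = similar_users(interactions)
```

With this sample data, alice and bob would be suggested to each other, while carol, who shares nothing with them, would not. A real system would probably weight a "save" differently from a "click", which this sketch does not attempt.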

In conclusion, I think pre-filtering is the key: to scale up human-based curation, we are implementing a pre-filter system that takes into account the value of the source, the reach of the article on social media, and the actual meaning, tone, length and other characteristics of valued content, in order to give curators a list of content that is already high-value and high-interest.
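
As a rough illustration of what such a pre-filter could look like, the sketch below combines three of the signals mentioned above (source value, social reach and length) into a single score and keeps only the articles above a cut-off. It is not our actual implementation: the field names, the weights, the share cap, the preferred length and the threshold are all hypothetical, and meaning and tone are left out because they would require an NLP model.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    source_trust: float  # hypothetical 0..1 rating of the source
    social_shares: int   # how often the article was shared
    word_count: int


def prefilter_score(article, preferred_length=800, share_cap=1000,
                    weights=(0.5, 0.3, 0.2)):
    """Weighted sum of source trust, capped reach and length fit (0..1)."""
    # Reach saturates at share_cap so one viral article cannot dominate.
    reach = min(article.social_shares, share_cap) / share_cap
    # Length fit is 1.0 at the preferred length and falls off linearly.
    deviation = abs(article.word_count - preferred_length) / preferred_length
    length_fit = max(0.0, 1.0 - deviation)
    w_source, w_reach, w_length = weights
    return (w_source * article.source_trust
            + w_reach * reach
            + w_length * length_fit)


def shortlist(articles, min_score=0.6):
    """Keep articles scoring at least min_score, best first."""
    scored = [(prefilter_score(a), a) for a in articles]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return [a for _, a in sorted(kept, key=lambda p: p[0], reverse=True)]


# Made-up sample data for illustration only.
candidates = [
    Article("Trusted, well shared", source_trust=0.9,
            social_shares=500, word_count=800),
    Article("Unknown, barely shared", source_trust=0.2,
            social_shares=10, word_count=200),
]
to_review = shortlist(candidates)
```

The curator would then only see the articles in `to_review`, and the human judgment stays where it matters: on a short, pre-qualified list rather than on the raw stream.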

What do you think? Do you see any other aspects where algorithms could help scale up content curation?