ADA Lab blog

  • Sparkling-ferns for #ApacheSpark (Part 1: The Algorithm)
    17 November 2015 by Piotr Dendek
    Two weeks ago, together with Mateusz Fedoryszak, I attended the first European Spark Summit (#SparkSummitEU). What did we find there and how did we enrich the Spark community? Let me tell you the story of the summit and Sparkling Ferns...
  • PhD defense of ADA Laber Mateusz Kobos
    18 June 2015 by Mateusz Kobos
    On 2015-06-11, I defended my PhD thesis entitled "Multiresolution classification using combination of density estimators" at the Systems Research Institute of the Polish Academy of Sciences.
  • CERMINE wins award at ESWC 2015
    17 June 2015 by Łukasz Bolikowski
    ADA Lab's CERMINE participated in the Semantic Publishing challenge during the recent Extended Semantic Web Conference (ESWC 2015) in Portorož, Slovenia, and we won the Best Performing Approach Award!
  • Introducing ADA Lab Open Science APIs
    11 May 2015 by Mateusz Fedoryszak
    Having our roots in the Centre for Open Science (CeON), we're very keen on making sure anybody interested can take advantage of the algorithms we design. Today we are taking another step in that direction: we introduce ADA Lab Open Science APIs.
  • Text Mining Services in OpenAIRE
    16 February 2015 by Łukasz Bolikowski and Mateusz Kobos
    Recently in Athens there was an impressive kick-off of the OpenAIRE2020 project, during which we presented OpenAIRE’s plans in the area of text and data mining of scholarly publications. Publications contain all kinds of rich information, which, although understandable to a human reader, is not machine-readable and thus cannot be used directly for indexing and recommendation purposes. Authors’ affiliations, document classifications, references to biological and chemical databases, acknowledgem...
  • Let's join FORCEs and make a difference in scholarly communication
    02 February 2015 by Dominika Tkaczyk
    Two weeks ago I participated in FORCE2015 in Oxford. It was the third conference organized by the FORCE11 community and a must-attend event for people interested in scholarly communication, and in particular its problems and various ways of addressing them.
  • Kraków – where AI meets the law
    22 December 2014 by Michał Łopuszyński
    Recently, I was lucky enough to participate in the JURIX 2014 conference, taking place in Kraków, 10-12 December 2014. This was an event aimed at injecting the advancements of computer science into the legal domain. I must admit that the Organizers really achieved their goal. At least from my strongly computer-scientish perspective... During the conference, I presented a proof-of-concept study on how to detect and analyze topical trends in public procurement judgments. You can have a look ...
  • Spark, D3, data visualization and Super Cow Powers
    26 November 2014 by Mateusz Fedoryszak
    Did you know that the amount of milk given by a cow depends on the number of days since its last calving? A plot of this correlation is called a lactation curve. Read on to find out how we use Apache Spark and D3 to work out how much milk we can expect on a particular day.
  • Affiliation parsing in CERMINE
    13 November 2014 by Dominika Tkaczyk
    CERMINE is our Java library for extracting metadata from scientific literature. Among other information, CERMINE extracts the authors of the input document, their affiliations, and also associates authors with affiliations. Recently, new functionality has been added: affiliation parsing.
  • Interview with Michael Jordan about machine learning, big data, and other things
    27 October 2014 by Mateusz Kobos
    Recently, IEEE Spectrum interviewed Michael Jordan, a leading researcher in machine learning. He gave his view on the hype in machine learning and big data analysis, and shared his thoughts on some other interesting issues (technological singularity, P=NP, Turing test).
  • Paperity chooses CERMINE as its content extraction engine
    23 October 2014 by Selcuk Ayguney & Marcin Wojnarski
    This is a guest post by Selcuk Ayguney and Marcin Wojnarski, creators of Paperity. We invited the authors to share their reasons for choosing ADA Lab's (recently awarded) CERMINE as their content extraction engine. Here's their story.
  • Summer internship at ADA Lab
    06 October 2014 by Jan Lasek
    My name is Jan Lasek and I was an intern with the ADA Lab team at ICM this summer. I must say that it was a great experience to work here!
  • Impressions from PolTAL 2014
    30 September 2014 by Michał Łopuszyński
    A couple of days ago, members of our lab participated in PolTAL 2014, a conference bringing together linguists, computer scientists, and other researchers involved in computational linguistics and natural language processing.
  • Mind the gap! – DL2014
    23 September 2014 by Łukasz Bolikowski, Mateusz Fedoryszak & Dominika Tkaczyk
    Recently, a few people from our lab visited London to participate in Digital Libraries 2014, a joint edition of TPDL and JCDL, the two best-known conferences on digital libraries.
  • Want to remember Spark API or learn Scala? Use our courses on memrise.com
    15 September 2014 by Piotr Dendek
    You need 20 hours to become reasonably good at something and 10,000 hours to become an expert in any domain. Become an expert more easily and faster!
  • datadr: split-apply-combine package for R backed by Hadoop
    04 September 2014 by Mateusz Kobos
    datadr is a package for the R programming language that provides split-apply-combine functionality for data transformation. See the Quickstart section in the project's documentation for a nice overview of the package's capabilities.
  • CockroachDB: an open source version of Google Spanner
    25 July 2014 by Mateusz Kobos
    A team of ex-Googlers is building an open source version of Google Spanner, i.e., a transactional database that spans many data centers.
  • Building Apache Spark App with Maven
    15 July 2014 by Artur Czeczko & Mateusz Fedoryszak
    Recently we've been working on building Spark apps with Maven.
  • Data science workflow
    13 June 2014 by Mateusz Kobos
    A description of a data scientist's workflow, published on the CACM blog.
  • FUSE: project for mining game-changing technologies from scientific publications and patents
    12 June 2014 by Mateusz Kobos
    In May's Nature, there is a column about an interesting text mining project called FUSE. The project is backed by a US intelligence agency; its goal is to predict game-changing technologies by mining scientific publications and patent applications.
  • At an OpenAIREplus technical meeting in Pisa
    06 June 2014 by Mateusz Kobos
    Last week, three of us (Mateusz, Marek, Paweł) attended a technical meeting of the OpenAIREplus project in Pisa.
  • CERMINE wins Best Student Paper Award at DAS conference
    11 April 2014 by Dominika Tkaczyk
    The CERMINE system was presented yesterday at this year's Document Analysis Systems conference. Our article entitled "CERMINE - automatic extraction of metadata and references from scientific literature" won the ITESOFT Best Student Paper Award.
  • On big trouble with big data at TOK FM
    10 April 2014 by Łukasz Bolikowski
    Yesterday at TOK FM (a popular Polish talk radio station) I talked with Cezary Łasiczka about the recent FT.com article by Tim Harford titled "Big data: are we making a big mistake?".
  • Scoobi, Scalding, Spark, Stratosphere – ICM at Scalar 2014
    05 April 2014 by Mateusz Fedoryszak & Michał Oniszczuk
    We gave a talk on Scala in ADA Lab at the Scalar 2014 conference.
  • Perfect Data Analysis for Every Moment – ICM at Spotify 2014
    26 March 2014 by Piotr Jan Dendek, Mateusz Fedoryszak & Michał Oniszczuk
    On Monday we started our one-week in-house cooperation with Spotify at its Stockholm HQ.
  • Visit at ScraperWiki
    26 March 2014 by Dominika Tkaczyk
    I spent last week in Liverpool visiting ScraperWiki. ScraperWiki provides tools for extracting, cleaning, analysing and managing data coming from various sources.
  • Article: Cloudera Oryx as the next Mahout
    13 March 2014 by Mateusz Kobos
    A quite interesting article on Gigaom.com says that Cloudera is developing a system called Oryx, which aims to be a better Mahout.
  • Mathematical modelling workshop for talented youth
    31 January 2014 by Łukasz Bolikowski
    Every year for almost 20 years, in collaboration with the Polish Children's Fund, ICM has organized weekly workshops for talented youth. This year's edition has just finished.
  • Debugging and manipulate function in RStudio
    30 December 2013 by Mateusz Kobos
    A note for R and RStudio enthusiasts about cool new features in the most recent version of RStudio (0.98), which I noticed today.
  • 12-factor app
    17 December 2013 by Mateusz Kobos
    The 12-factor app is a manifesto, or a set of good engineering practices, for modern web applications (but not only for them), created by people from Heroku and based on their extensive experience.
  • Facebook Presto
    08 November 2013 by Mateusz Kobos
    Facebook has just open-sourced Presto, its Hadoop solution for running SQL queries on big data.
© 2013-2015 ICM, University of Warsaw. Some rights reserved. Unless otherwise marked, content on this website is licensed under the Creative Commons Attribution 4.0 license.