Conference 2015

Laurens van der Maaten: Constructing Maps to Visualize Big Data

Abstract:
Visualization techniques are essential tools for every data scientist. Unfortunately, the majority of visualization techniques can only be used to inspect a limited number of variables of interest simultaneously. As a result, these techniques are not suitable for big data that is very high-dimensional.
An effective way to visualize high-dimensional data is to represent each data object by a two-dimensional point in such a way that similar objects are represented by nearby points, and that dissimilar objects are represented by distant points. The resulting two-dimensional points can be visualized in a scatter plot. This leads to a map of the data that reveals the underlying structure of the objects, such as the presence of clusters.
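One way to make the "similar objects end up nearby" criterion concrete is to check how many of each object's nearest neighbors in the original space are still among its nearest neighbors in the map. The sketch below is only an illustration of that idea, not a method from the talk; the function names, the toy data, and the choice of k = 2 are all assumptions made for the example.

```python
import math


def nearest_neighbors(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def neighbor_preservation(high_dim, low_dim, k=2):
    """Average fraction of each object's k nearest neighbors that the map keeps."""
    hi = nearest_neighbors(high_dim, k)
    lo = nearest_neighbors(low_dim, k)
    return sum(len(h & l) / k for h, l in zip(hi, lo)) / len(hi)


# Hypothetical toy data: two groups of three objects in four dimensions.
objects = [
    (0.0, 0.1, 0.0, 0.2), (0.1, 0.0, 0.2, 0.0), (0.2, 0.1, 0.1, 0.1),
    (5.0, 5.1, 4.9, 5.0), (5.1, 5.0, 5.0, 4.9), (4.9, 5.2, 5.1, 5.1),
]
# A map that keeps each group together, and one that mixes the groups.
good_map = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
bad_map = [(0.0, 0.0), (3.0, 3.0), (0.1, 0.0), (3.1, 3.0), (0.0, 0.1), (3.0, 3.1)]

print("good map:", neighbor_preservation(objects, good_map))
print("bad map: ", neighbor_preservation(objects, bad_map))
```

A score near 1 means the scatter plot places each object next to the objects it resembles; a low score means the map has mixed up the neighborhoods.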
The talk gives an overview of techniques for constructing such maps. In addition, we present a new technique, called t-Distributed Stochastic Neighbor Embedding (t-SNE). We demonstrate the value of t-SNE in domains such as computer vision and bioinformatics, and we show how to scale up t-SNE to big data sets with millions of objects.
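For readers who want a feel for how t-SNE builds a map, here is a minimal sketch in plain Python. It is a simplification under stated assumptions, not the speaker's implementation: it uses a single fixed Gaussian bandwidth (`sigma`) instead of the per-point perplexity search, plain gradient descent without momentum or early exaggeration, and an exact O(N²) gradient, so it does not reflect the scaled-up variant mentioned in the abstract. All function names, parameter values, and the toy data are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def joint_p(data, sigma=1.0):
    """Symmetric input similarities p_ij (fixed sigma is a simplifying assumption)."""
    n = len(data)
    cond = []
    for i in range(n):
        row = [math.exp(-sq_dist(data[i], data[j]) / (2 * sigma ** 2)) if j != i else 0.0
               for j in range(n)]
        total = sum(row)  # assumes every point has at least one reasonably close neighbor
        cond.append([v / total for v in row])
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]


def tsne(data, iterations=500, learning_rate=0.5, seed=0):
    """Return 2-D map points found by plain gradient descent on KL(P || Q)."""
    rng = random.Random(seed)
    n = len(data)
    p = joint_p(data)
    # Start from a small random layout.
    y = [[rng.gauss(0, 1e-2), rng.gauss(0, 1e-2)] for _ in range(n)]
    for _ in range(iterations):
        # Student-t kernel between map points; q_ij = w_ij / z.
        w = [[1.0 / (1.0 + sq_dist(y[i], y[j])) if i != j else 0.0
              for j in range(n)] for i in range(n)]
        z = sum(map(sum, w))
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                coeff = 4.0 * (p[i][j] - w[i][j] / z) * w[i][j]
                grad[i][0] += coeff * (y[i][0] - y[j][0])
                grad[i][1] += coeff * (y[i][1] - y[j][1])
        for i in range(n):
            y[i][0] -= learning_rate * grad[i][0]
            y[i][1] -= learning_rate * grad[i][1]
    return y


# Hypothetical toy data: two clusters of five objects in five dimensions.
data_rng = random.Random(1)
centers = [0.0] * 5 + [6.0] * 5
data = [[c + data_rng.gauss(0, 0.2) for _ in range(5)] for c in centers]

embedding = tsne(data)
for center, (u, v) in zip(centers, embedding):
    print(f"cluster center {center:>3}: map point ({u:+.2f}, {v:+.2f})")
```

On real data one would normally reach for an existing library implementation rather than a loop like this; the sketch is only meant to show the two ingredients of the method, Gaussian similarities between the original objects and heavy-tailed Student-t similarities between the map points, matched by minimizing a Kullback-Leibler divergence.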