Industry Track

The Industry track is a track of the WWW 2014 (International World Wide Web) conference that will be held on April 9-10, 2014 in Seoul, Korea. We hope to bring the Industry track to a new height that will serve as a key attraction for WWW 2014 and deliver meaningful impact to the World Wide Web community.

The Industry track will feature leading experts in the world of web services. The track will consist of invited technical talks and possibly panel discussions / debates that focus on innovative, leading-edge, large-scale web services in areas such as (but not limited to) semantic latent search, web commerce, social platforms, payment systems, web-based education, web-based games, big data analytics, mobile network performance / power optimization, HTML5 mobile app development, distributed storage for cloud services, Hadoop and related distributed transaction processing technology, advances in the HTTP protocol, and deep neural networks and their application to web services.

This track will complement the Research and Web Science paper track at WWW, which will focus on peer-reviewed publications.

Prospective Speaker list

Ashish Goel (Professor, Stanford University and Research Fellow, Twitter, Inc)

Twitter’s Recommendation System: Algorithms, Architectures, and Applications

We will describe the initial architecture of Twitter’s “Who To Follow” user recommendation system. We will describe some of the algorithms behind this system, and additional applications of these algorithms for targeting Twitter’s promoted products. We will also present theoretical results showing how several of these recommendation algorithms can be implemented at scale on modern data processing architectures such as MapReduce and Distributed Streaming.
Ashish Goel is a Professor of Management Science and Engineering and (by courtesy) Computer Science at Stanford University, and a member of Stanford’s Institute for Computational and Mathematical Engineering. He received his PhD in Computer Science from Stanford in 1999, and was an Assistant Professor of Computer Science at the University of Southern California from 1999 to 2002. His research interests lie in the design, analysis, and applications of algorithms; current application areas of interest include social networks, Internet commerce, and large-scale data processing. Professor Goel is a recipient of an Alfred P. Sloan faculty fellowship (2004-06), a Terman faculty fellowship from Stanford, an NSF Career Award (2002-07), and a Rajeev Motwani mentorship award (2010). He was a co-author on the paper that won the best paper award at WWW 2009. Professor Goel is also a research fellow and technical advisor to Twitter, Inc, where he prototyped several early algorithmic products. The work described in this abstract has been chosen as a finalist for the Franz Edelman award.
Andrew Kirmse (Google)

Google Now and Future Directions in Prediction

Recent advances in mobile devices have made possible a whole new category of contextual recommendation applications. This talk will give an overview of the operation of Google Now, one of the first such applications. We’ll explore how prediction combines user modeling with traditional information retrieval to generate predictions. We will discuss the promise and limitations of such data sources as location and proximity, and how research into analysis of these data sources will change the nature of prediction over the next few years.

Andrew was a founder of Google Now and leads the engineering team. He has also led development of Google Earth, and Google Maps for Mobile. Google Now was Popular Science’s Innovation of the Year in 2012, and Google Maps for Mobile won Best Consumer Mobile Service at the 2012 Mobile World Congress. Prior to joining Google in 2003, he spent 10 years in the videogame industry specializing in graphics and online games.

Andrew received a master of engineering in computer science, and undergraduate degrees in computer science, physics and theoretical mathematics, from the Massachusetts Institute of Technology.

Yoelle Maarek (VP of Yahoo Research)

The rise of the machines: The challenges of mining and consuming machine-generated Web mail

In spite of personal communications moving more and more towards social and mobile, especially with younger generations, email traffic continues to grow. This growth is mostly attributed to (non-spam) machine-generated email, which, against common perception, is often extremely valuable. Indeed, together with monthly newsletters that can easily be ignored, inboxes contain flight itineraries, booking confirmations, receipts or invoices that are critical to many users. In this talk, I will discuss the new nature of consumer email, which is dominated by machine-generated messages of highly heterogeneous forms and value. I will show how the change has not been fully recognized yet by most email clients (as an example, why should there still be a reply option associated with a message coming from a “do-not-reply@” address?). I will introduce some approaches for large-scale mail mining specifically tailored to machine-generated email. I will conclude by discussing possible applications and research directions.
Yoelle Maarek is the head of Yahoo Labs Israel and India, and a Vice President at Yahoo. Her team conducts research activities impacting products such as Yahoo! Mail, Yahoo! Answers and Yahoo! Search. Prior to this, Yoelle was the Director of the Google Haifa Engineering Center, which she opened in 2006 and grew to close to 40 team members. There, she led the team that launched “Suggest”, Google’s query completion feature on and YouTube worldwide. From 1989 to 2006, Yoelle was with IBM Research, first in the US, and then in Israel, where she held a number of technical and management positions, eventually leading the search and collaboration department and becoming a Distinguished Engineer. She received her PhD in Computer Science from Technion, in Haifa, Israel. In parallel, during her PhD studies, she spent a year in the Computer Science Department of Columbia University in New York, as a visiting PhD student. She graduated from the “Ecole Nationale des Ponts et Chaussees” in Paris, France, and received her “DEA” (graduate degree) in Computer Science from Paris VI University, both in 1985. Yoelle’s research interests include Information Retrieval, Web search, Web mining and Web applications. She has published more than 70 articles in these fields and is active in the research community. She has served as a regular or senior PC member at most recent SIGIR, WWW and WSDM conferences, and as PC co-chair of WWW’2009, WSDM’2012 and SIGIR’2012. Yoelle is a member of the Board of Governors of the Technion, chairing its Student Affairs Committee. She has been an ACM Fellow since 2013.
David K. Min (Sr. Research Fellow, Software Platform Lab., LG Electronics)

Title: Consumer Electronics and the Web: Issues and Challenges
Recently we have seen a trend toward increased web usage in everyday life thanks to (1) ubiquitous high-speed wireless Internet, (2) standardization of web technologies, now at such a state that HTML5 standards are widely used for comparison among browser software features, and (3) advances in semiconductor technologies that make it possible to design products with powerful hardware resources at such a reasonable cost that people now expect a fairly good user experience when running a browser on consumer electronics products to access information on the Internet. On the other hand, for better adoption of web technologies in our products, there are a few hurdles we need to overcome. The more we are accustomed to relying upon web technologies for everyday use, the more we see a trend toward trying to use the web for mission-critical applications. However, the system was not originally designed for such applications. Also, while we have seen a significant performance improvement over the last several years, we still yearn for better performance so that web applications can provide as good a user experience as one might expect when using a native application. Furthermore, programming using web technology is not quite ready for writing large-scale applications. In this talk we will introduce the directions LG Electronics is taking to address the hurdles mentioned above.
David Min received a BS in computer science and statistics from Seoul National University, an MS in computer science from KAIST, and a PhD in computer science from the University of Illinois, Urbana. David has worked in various capacities in the software industry for almost three decades. He started his career as a systems analyst in the IT department of Samsung Co, a trading company in Korea. After finishing the PhD program in Illinois, he worked as principal technical staff at Digital Equipment Corp. for several years, working in the area of database technologies. Since 2000, he has been working in embedded software for consumer electronics. David worked as vice president of software development at the Visual Display business unit of Samsung Electronics, senior software engineer at the Microsoft IPTV division, and technical consultant at Silicon Image. He has been with LG Electronics since 2007 and is now leading the Software Platform Laboratory of the company.
Jan Pedersen (Chief Scientist at Microsoft)

Title: Web-scale Semantic Ranking
Semantic ranking models score documents based on closeness in meaning to the query rather than by just matching keywords. To implement semantic ranking at Web-scale, we have designed and deployed a new multi-level ranking system that combines the best of inverted index and forward index technologies. I will describe this infrastructure, which is currently serving many millions of users, and explore several types of semantic models: translation models, syntactic pattern matching and topical matching models. Our experiments demonstrate that these semantic ranking models significantly improve relevance over our existing baseline system.
Jan O. Pedersen is currently a Distinguished Engineer at Microsoft. Pedersen began his career at Xerox’s Palo Alto Research Center (PARC) where he managed a research program on information access technologies. In 1996 he joined Verity (recently purchased by Autonomy) to manage their advanced technology program. In 1998 Dr. Pedersen joined Infoseek/Go Network, a first generation Internet search engine, as Director of Search and Spidering. In 2002 he joined Alta Vista as Chief Scientist. Alta Vista was later acquired by Yahoo!, where Dr. Pedersen served as Chief Scientist for the Search and Advertising Technology Group. Prior to joining Microsoft, Dr. Pedersen was Chief Scientist at A9, an Amazon company.
Dr. Pedersen holds a Ph.D. in Statistics from Stanford University and an AB in Statistics from Princeton University. He is credited with more than ten issued patents and has authored more than twenty refereed publications on information access topics, seven of which are in the Special Interest Group on Information Retrieval (SIGIR) proceedings.
Pavel Serdyukov (Yandex)

Analyzing behavioral data for improving search experience at Yandex

Yandex is one of the largest internet companies in Europe, operating Russia’s most popular search engine and generating 62% of all search traffic in Russia, which means processing about 220 million queries from about 22 million users daily. Clearly, the amount and the variety of user behavioral data which we can monitor at search engines is rapidly increasing. Still, we do not always recognize its potential to help us solve the most challenging search problems and do not immediately know the ways to deal with it most effectively, both for search quality evaluation and for its improvement. My talk will focus on various practical challenges arising from the need to “grok” search engine users and do something useful with the data they most generously, though almost unconsciously, share with us. I will also present some answers by giving an overview of our latest research on user model based retrieval quality evaluation, implicit feedback mining and personalization.
I will also summarize the experience we gained from organizing three data mining challenges at the series of workshops on using search click data (WSCD) organized in the scope of the WSDM 2012 – 2014 conferences. These challenges provided a unique opportunity to consolidate and scrutinize the work from search engines’ industrial labs on analyzing behavioral data. Each year we publicly shared a fully anonymized dataset extracted from Yandex query logs and asked participants to predict editorial relevance labels of documents using search logs (in 2011), detect search engine switching in search sessions (in 2012) and personalize web search using the long-term (user history based) and short-term (session-based) user context (in 2013).
Pavel Serdyukov is the Head of Research Projects at Yandex, where he manages a team of researchers working in the field of web search and data mining. He has published extensively in top-tier conferences on topics related to web search, personalization, enterprise/entity search, query log analysis, location-specific retrieval and recommendation. He co-organized a number of workshops at SIGIR, was a co-organizer of the Entity track at TREC 2009-2011, and co-organized a series of workshops at WSDM in 2012 – 2014. Recently, he was also the General Chair of ECIR 2013 in Moscow. Before joining Yandex in 2011, he was a postdoc at Delft University; he received his PhD from Twente University (2009) and his MSc from the Max-Planck Institute for Computer Science (2005).
Ramesh Sitaraman (Professor at UMass, Akamai Fellow)

The Billion Dollar Question in Online Videos: How Video Performance Impacts Viewer Behavior?

Online video is the killer application of the Internet. It is predicted that more than 85% of the consumer traffic on the Internet will be video-related by 2016. But, can online videos ever be fully monetized? The future economic viability of online video rests squarely on our ability to understand how viewers interact with video content. For instance:

• If a video fails to start up quickly, would the viewer abandon?
• If a video freezes in the middle, would the viewer watch fewer minutes?
• If videos fail to load, is the viewer less likely to return to the same site?

In this talk, we outline scientific answers to these and other such questions, establishing a causal link between video performance and viewer behavior. One of the largest such studies, our work analyzes the video viewing habits of over 6.7 million viewers who in aggregate watched almost 26 million videos. To go beyond correlation and to establish causality, we develop a novel technique based on Quasi-Experimental Designs (QEDs). While QEDs are well known in the medical and social sciences, our work represents their first use in network performance research and is of independent interest. This talk is of general interest and is accessible to a broad audience.

Prof. Ramesh K. Sitaraman is currently in the School of Computer Science at the University of Massachusetts at Amherst. He is best known for his role in pioneering the first large content delivery networks (CDNs) that currently deliver a significant fraction of the world’s web content, streaming videos, and online applications. As a principal architect, he helped create the Akamai network and is an Akamai Fellow. His research focuses on all aspects of Internet-scale distributed systems, including algorithms, architectures, performance, energy efficiency, user behavior, and economics. He received a B. Tech. in electrical engineering from the Indian Institute of Technology, Madras, and a Ph.D. in computer science from Princeton University.

Hugh Williams (SVP at Pivotal)
The Third-Generation Platform: What enterprises are learning from consumer Internet companies

Internet companies innovate quickly. With data, experimentation, and iteration they continually evolve their products to meet the demands of consumers. A typical Internet company is running tens or hundreds of parallel A/B tests, keeping every piece of data, and storing it in large centralized data lakes. Importantly, data makes solving problems easier: there has been a revolution toward solving problems with data rather than solving problems with algorithms. Enterprises understand that data science, big data, and Hadoop are the future. The challenge is how to transform companies, and how to change the company DNA to one that experiments, learns, and iterates. In this talk, Hugh discusses how Internet companies work, and explains why this is the future of the enterprise world. He gives real-world examples, shows how some companies are crossing the chasm, and outlines the exciting challenges ahead. He explains why, as a consumer Internet veteran, he decided to join Pivotal and help drive this enterprise revolution.
Hugh E. Williams is SVP, R&D at Pivotal Software Inc. He has spent twenty years researching and developing search engines, web services, and “big data” technologies, including long stints leading large teams at eBay and Microsoft. He has published over 100 works, mostly in the field of Information Retrieval, including two books: “Web Database Applications with PHP and MySQL” and “Learning MySQL” for O’Reilly Media Inc. He holds nineteen US patents and has a PhD from RMIT University in Australia.
Wolfgang Konig (European Patent Office)
Title: Innovation and Intellectual Property in the field of Human Computer Interaction
Igor Perisic (VP Engineering at LinkedIn)

Some lessons from scaling a large, web-scale social platform

Any company looks very fondly at the potential of explosive growth. It is identified with success. After all, growth means that your members appreciate your offering, and exponential growth can be construed to simply mean that these members appreciate it very much. How a company survives when it hits that geyser is an interesting story. Indeed, no service is infinitely scalable from its inception, nor does any company have infinite manpower. We’ll take a look at various lessons learned through our experience at LinkedIn. The talk will present some of our experiences from scaling backend infrastructures, developing machine learned models for various relevance problems, and the consequences of having a rapidly growing organization. At the core, a picture of Data Integrity and Experimentation will develop as key success factors.
Igor Perisic is currently a VP of Engineering, Data and Analytics, at LinkedIn, where he built and manages a team of engineers working in the areas of Search, Real Time Social Graph Engine, Relevance and Relevance Infrastructure. His team is involved in all areas of LinkedIn where Personalization and Machine Learned technologies can better the user experience. These include such products as People You May Know and the recently deployed Sponsored Updates. A firm believer in Open Source technology, he has scaled his services leveraging some Open Source components such as Lucene and Hadoop, but has also contributed significantly back to the community with Apache Kafka, Voldemort and DataFu, among others. He has spent years researching and developing Search Engines, Recommendation Engines and what is currently referred to as Big Data. Before joining LinkedIn, he spent a year with Microsoft’s Search Labs, and prior to this he was involved with various startups in the area of Search and Knowledge Management. He has an undergraduate degree in Mathematics from the EPFL and a PhD in Statistics from Harvard.
Ashish Gupta

The Paradoxes of Internet Company Creation in India

India is a fast-growing technology economy with a tradition of software development. In addition, there is enormous growth of technology companies, especially those using the internet and mobile channels. The growth is driven both by large-scale adoption by connected consumers and by entrepreneurs who are creating companies. In this talk we will focus on the internal aspects of internet/mobile technology companies (HR, technology choices, business challenges). In that context we will discuss the different types of companies being created, their choices, and the opportunities and challenges faced by some of them. We will also abstract out some lessons derived from the experience of some of these companies.
Dr. Ashish Gupta is a co-founder of Helion and serves on the boards of InfoEdge (INFO), Jivox, Kirusa, Komli, Pubmatic, and SMSGupshup. He has co-founded two successful companies – Tavant Technologies and Junglee (AMZN). He has also worked at Woodside Fund, Oracle Corporation and IBM Research. Some of his other investments are Daksh (IBM), Mu Sigma, Odesk, MakeMyTrip and redBus (MIH). Ashish is a Kauffman Fellow, holds a Ph.D. in Computer Science from Stanford University, and a Bachelor’s degree from Indian Institute of Technology Kanpur, where he was awarded the President’s Gold medal. He has authored several patents, publications, and a book by MIT Press.

Industry track Co-chairs

• Rakesh Agrawal (Microsoft Research)
• Chang Song (Naver)

Industry track advisory committee members

• Adam Messinger (Twitter CTO)
• Won Kim (Professor at Gachon University)
• Henk Goosen (OptumSoft CEO/CTO)
• Myle Ott (Facebook)
• Greg Malewicz (Facebook)
• Ron Brachman (Head of Yahoo Labs)
• Prabhakar Raghavan (VP of Engineering at Google)
• Qi Lu (Executive VP at Microsoft)
• Rajesh Parekh (Sr. Director at Groupon)