Pablo Mayrgundter

pablo.mayrgundter@gmail.com
http://freality.org/~pablo/resume/

Professional

My professional interest is to take part in the formation and development stages of organizations or teams that bring large-scale computing, analytics and modelling technologies to challenging market or community applications. I am particularly interested and experienced in opportunities involving High Performance Computing, Data Mining, Artificial Intelligence and Virtual Reality.

Google

Google is a leading search engine that also engages in cutting-edge computer science and engineering. The Site Reliability Engineering group at Google is responsible for the production systems which perform the vast majority of the company's computing work. (Although out-of-date, this Wired article gives a sense of the size of these systems and how they're used.)

  • Tech Lead, Instant Indexing Production: Instant Indexing is the system that produces search results for documents recently published to the web. I'm responsible for ensuring its availability, scalability and performance.
  • Tech Lead, Blog Search Production: Blog Search is the system that produces search results for blogs published to the web, and has a dedicated subdomain at blogsearch.google.com. I'm responsible for ensuring its availability, scalability and performance.

Panther Express

Panther was started in the summer of 2005 to serve the growing Content Distribution market, and particularly to target the space left when the main provider, Akamai, acquired the number two provider, Speedera.

Panther has since merged with CDNetworks.

  • Co-Founder: As co-founder, I was challenged to help build an organization that could quickly develop and deploy a world-class CDN solution. Panther succeeded in this goal by going from first planning meeting to beta trials of a 15-city, 5-network CDN within 6 months with minimal capital expenditure. Panther closed its first round of funding in July of 2006.
  • Architect: As architect, I designed and developed a high-capacity caching HTTP/1.1 server, and assisted in network capacity planning, traffic routing and load-balancing design for Panther's worldwide, 20-city, 10-network content distribution service.

DoubleClick

DoubleClick is the leading Internet marketing company, serving trillions of ads per year around the world in a variety of media.

DoubleClick has since been acquired by Google.

  • Senior Software Engineer, DartOne: Team member of a small R&D group developing a next-generation ad and media serving architecture for DoubleClick's extremely high traffic rates and rich service offerings.
  • Software Engineer, Sonar: Team member of a small R&D group reporting to the Co-Founder/CTO, developing an automated shopping search-engine/portal. Sonar was eventually spun off as ShopWiki.

Designed and implemented various web-page information extraction techniques, including methods based on Bayesian text classification libraries and custom-made logical inference learners.

Designed and implemented a partially HTTP/1.0-compliant web media server capable of servicing 10-20k simultaneous connections on a commodity processor using Java NIO, thus greatly reducing the number of machines needed to support DoubleClick's ~8 billion daily ad serves. The server was zero-copy and dual-threaded, the second thread being used for arbitrary statistics collection. The server went live in 2006 on DoubleClick's main ad-serving system but was later retired. Ah well.

Assisted in development of a high-capacity web crawler.

Consolidated a collection of idle pre-deployment Linux servers into a 64-CPU task-oriented processing cluster. Helped factor one of the largest Cunningham numbers of 2006.

Reel Two

Reel Two produces high-performance information analytics solutions for industries challenged by increasing volumes of complex data, such as Life Sciences, Legal and Media.

Reel Two has since been acquired by NetValue.

  • Co-Founder: Assisted in developing a strategy for our company in the Knowledge Management industry after exploring emerging information markets and their associated product and service opportunities.
  • Director of Applications and Services: Led design and use-case development of prototypes and successive major version releases of applications for dataset analytics and classification, document entity extraction and web-based document management portals. Presented solutions to clients and advised licensees in architecture planning.

Designed and assisted in construction of a document search solution for a major client that exceeded the price/performance capabilities of solutions available on the market. The system managed tens of millions of documents using Lucene, with parallel index building across a large Linux cluster and index merging on a large-memory (16GB) machine.

Designed and constructed a document portal product with search, classification and multi-user curation functionality. The portal accommodated a dozen researchers concurrently searching and modifying tens of thousands of document categorizations across a large client-defined taxonomy. Updated curations were used to reorder the documents within the full taxonomy in real time.

Conducted natural language research in automatic term disambiguation using context windowing around target terms to model various meanings.

Webmind (formerly Intelligenesis)

(site is archived here)
Webmind was a global Artificial Intelligence R&D company started in 1998. I left school early to be one of the first employees here. Here is a description of what we were working on.

Webmind closed in 2001, but produced two spin-offs: Market Predictor and Reel Two, my next company.

  • Project Lead: Led the New York Webmind Classification System (WCS) team, which produced the first and primary product of the company. Market analysis and product specification for 1.0-2.0 releases of WCS. Pre-sales engineering for WCS.
  • Researcher: Assisted in development of the core research platform, including work in distributed network datastructures and Java VM analysis.

Managed a parallel processing project team that developed an architecture for distributing the workloads of applications such as fractal computation and prime factorization. The architecture yielded linear speedups on test applications.

SprintLink

SprintLink develops and maintains Sprint's national internet backbone, one of the largest in the world.
  • Software Engineer: Researched and prototyped a system to model the traffic flows of Sprint's national internet (IP) backbone. Designed and implemented a database of Sprint's USENET clients for real-time network administration. Helped design and implement a web front-end to this database, for both internal and client use. Acted as substitute administrator of Sprint's national USENET services for a two-week period.

J. T. Smith and Associates

J. T. Smith is a small computing consultancy. Assisted in corporate sales, account management, system administration and web site design and development.

Advancenet

Advancenet is an Internet service provider in operation since the beginning of the Internet. Assisted in sales of leased lines (56k-T1/DS1) and Internet services, web site design and development, UNIX administration, and technical support.

Previous Experience

My first job was delivering newspapers for The News Gazette, which I started around age 12 and continued through early high school. Also during this time I had various small initiatives involving lawn mowing, snow shovelling, corn detasseling, and the like. I stopped the paper route because of a move. In my new neighborhood, I served as a waiter at the Clark-Lindsey Retirement Village for 1 or 2 years, and finished up my time in high school as a dishwasher at The Art Mart in Downtown Urbana. After high school, I was fortunate enough to attend university.

Academic

My academic pursuits have focused on studying the fundamental concepts of Computer Science and mastering the advanced topics of Machine Learning associated with Artificial Intelligence and Natural Language Processing, a long-term and challenging goal.

I have no degree. I left school with a semester left so that I might work at Webmind as one of the first employees. While at Webmind I took a CS independent study at Columbia because, according to my advisor, that was the only credit I needed from a school equivalent to CMU. The rest of the credits I need to complete my degree are general electives and a science course, though I have postponed these indefinitely in favor of my other pursuits.

Carnegie Mellon University

School of Computer Science.

Columbia University

Independent Study in Programming Languages.

Skills

I have experience in a variety of computing systems and have focused on Java, UNIX and Web development.

Java

Programming experience in: NIO, JDBC, JFC (AWT/Swing/i18n), Java3D, JavaMail, Java XML, Lucene.

Access my public code library here.

Linux

Use and Administration in: Linux, Emacs, GNU Tools (e.g. bash, sed, awk, grep).

I am working on a kernel patch to monitor per-process I/O usage for reporting by the procps tools.

Web

Design and development experience in: Apache, Tomcat, XML, X/HTML, XSL, CSS, RDF, JSP/Servlet, PHP, Perl, SourceForge, Scoop.

Designed and/or manage: reeltwo.com, freality.org, vanvorstpark.org, fvvp.org, myshytune.com.

Research and Development

My research and projects focus on the following topics. These projects are ongoing. More information about each is linked:

  • Virtual Reality: My goal is to develop extensible structures for overlaying multiple naturalistic datasources in a virtual environment. I have developed a basic solar system simulator and am extending it to use live data from the Sloan Digital Sky Survey.
  • Operating Systems/GUIs: My goal is to have continuing experience in modern OS architectures and GUIs. Projects at school included implementing UNIX-style memory and file system datastructures, semaphore and monitor process synchronizers. Since then I have focused on developing a Linux distribution named w00tix that is small (1-2MB kernel and 16MB total size) and focused on running Java apps, including a simple desktop environment.
  • Library and DRM: My goal is to develop internet-based lending technologies that enhance the public domain while satisfying copyright challenges. I have created a test online library system that automatically serves common media formats via a web interface. I am currently adding Digital Rights Management facilities to the library system so that copyrighted works can be loaned without copyright infringement.
  • Machine Learning/Natural Language Processing/Artificial Intelligence: My goal is functional understanding of the basic mechanisms of intelligence, e.g. learning, memory and language. My AI projects include natural language corpus creation and analysis, development of a statistical parser using Deniz Yuret's Lexical Attraction method, and designing and implementing knowledge frameworks for semantic relationships deriving from various aspects of John Anderson's ACT-R, Ben Goertzel's Psynet and Doug Lenat's CYC. I am creating a set of tools and models for machine learning and thought process modelling.
  • Networking: My goal is to have continuing experience in modern networking technologies. Projects at CMU included implementing IPv6 network and transport protocol layers, an FTP client/server application on top of these, and an IPv6 router extended with ATM technology along the lines of Cisco Systems' tag switching protocol. Since then I've focused on thinking about p2p systems and co-founded an Internet Content Distribution Network.

Inventions

A full list of my inventions for IP agreements is listed here.

Hacks