Sanjay Goil

Cupertino, California, United States
3K followers 500+ connections

About

Product leader with focus on product management, developer evangelism, strategic…

Experience

  • Oracle

    Oracle

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    Bengaluru Area, India

  • -

    San Francisco Bay Area

  • -

    Greater Chicago Area

  • -

    Murray Hill, NJ

  • -

    New Delhi Area, India

Education

  • University of California, Berkeley, Haas School of Business
  • Activities and Societies: Highly Parallel OLAP and Data Mining. Selected publications: Bit Encoded Sparse Structure (BESS), https://bit.ly/2KlaZjM; High performance OLAP and data mining on parallel computers, DMKD'97; MAFIA: Efficient and scalable subspace clustering for very large data sets, SIGKDD'99 [355 citations]; Parallel algorithms for clustering high-dimensional large-scale datasets [Book (2013), Chapter 19: Data Mining for Scientific and Engineering Applications].

Volunteer Experience

  • Silicon Valley Monterey Bay Council, Boy Scouts of America

    Assistant Scout Master

    Silicon Valley Monterey Bay Council, Boy Scouts of America

    - Present 6 years 8 months

    Children

    Troop 566 in Saratoga helps boys go through their scouting journey and live the Scout Oath and Law every day. We meet every Thursday at 7:30pm at Menlo Church.
    > I serve as the ASM for a Patrol and as Den Chief Coordinator

  • Silicon Valley Monterey Bay Council, Boy Scouts of America

    Assistant Scout Master (ASM)

    Silicon Valley Monterey Bay Council, Boy Scouts of America

    - 1 year 3 months

    Children

    Troop 457 in Sunnyvale helps boys go through their scouting journey and live the Scout Oath and Law every day. We meet every Thursday at 7pm at Ortega Park.
    > I serve as the ASM for a Patrol and as the Troop Quartermaster
    > I was the crew lead for a high adventure backpacking trip (100 miles, 12 days) to the Boy Scout ranch Philmont in summer 2017

Publications

  • HAVEn Workbench: Enabling Analytics on Big Data

    HP TechCon

    HAVEn is HP's Big Data Platform. The success of the platform depends on the applications developed on it. This paper describes the HAVEn workbench, a cross-engine layer, hosted on the cloud (SaaS), that lets application developers and data scientists create new analytics applications out of the box.

  • Simplifying High Performance with Intel® Parallel Studio XE and Intel® Cluster Studio Tool Suites

    The Parallel Universe Magazine

    HPC programmers have traditionally been able to use all the
    compute power made available to them. Even with the performance
    leaps that Moore’s law has allowed Intel architecture to deliver over
    the past decade, the hunger for additional performance continues to
    thrive. There are big unsolved problems in science and engineering,
    physical simulations at higher granularities, and problems where the
    economically viable compute power provides lower resolution or
    piecemeal simulation of smaller portions of the larger problem.
    This is what makes serving the HPC market so exciting for Intel, and
    it is a significant driver for innovation in both hardware and software
    methodologies for parallelism and performance.

  • A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets

    International Conference on Parallel Processing

    In this paper we present a scalable parallel subspace clustering algorithm which
    has both data and task parallelism embedded in it. We also formulate the
    technique of adaptive grids and present a truly un-supervised clustering algorithm
    requiring no user inputs. Our implementation shows near linear
    speedups with negligible communication overheads. The use of adaptive
    grids results in two orders of magnitude improvement in the computation
    time of our serial algorithm over current methods with much better quality
    of clustering.

  • A Parallel Scalable Infrastructure for OLAP and Data Mining

    1999 International Conference on Database Engineering & Applications

    Decision support systems are important in leveraging information present in data warehouses in businesses like banking, insurance, retail and health care, among many others. The multi-dimensional aspects of a business can be naturally expressed using a multi-dimensional data model. Data analysis and data mining on these warehouses pose new challenges for traditional database systems. OLAP and data mining operations require summary information on these multi-dimensional data sets. Query processing for these applications requires different views of data for analysis and effective decision making. Data mining techniques can be applied in conjunction with OLAP for an integrated business solution. As data warehouses grow, parallel processing techniques have been applied to enable the use of larger data sets and reduce the time for analysis, thereby enabling evaluation of many more options for decision making.


Patents

  • Smart Prefetch

    Issued US


Honors & Awards

  • First Facebook, now Twitter — Hear at Strata how they use HAVEn to solve Big Data questions

    Strata Big Data Conference

    https://conferences.oreilly.com/strata/strata2014/public/schedule/speaker/167263
    From HP, I will be among the presenters appearing alongside Twitter Senior Database Reliability Engineer Josh Varner to discuss how Twitter is using HAVEn to run their Big Data analytics. It’s an incredible story: every day the social network ingests about 100 TB of data and runs tens of thousands of Hadoop jobs. It has used HAVEn to integrate HP Vertica with its Hadoop infrastructure to deliver the scale and speed needed to analyze 400 million tweets a day for recommendations and revenue growth opportunities.

  • Scaling Performance Forward with Intel Developer Tools

    Intel Developer Forum

    http://openlab.web.cern.ch/sites/openlab.web.cern.ch/files/presentations/A%20report%20from%20the%20INTEL%20Developer%20Forum.pdf

Languages

  • English

    -

Organizations

  • ACM, IEEE

    -
