CSCI E-108

Data Mining, Discovery, and Exploration

Extracting actionable insights and relationships from massive, complex datasets is the domain of data mining.

Data mining has wide-ranging applications in science and technology, where the size of the data rules out algorithms commonly applied at small scale.

This course addresses several key aspects of data mining, including the use of key-value pairs and hashing methods to manage and compute analytics for massive-scale datasets; highly scalable approximate similarity search and embedding algorithms for information retrieval, as used in retrieval-augmented generation (RAG), web search, image search, and recommendation systems; algorithms for ranking search and recommendation results; highly memory-efficient sketch algorithms for unbounded data, such as streaming data and online processing of massive datasets; unsupervised learning, including clustering models and dimensionality reduction algorithms, for finding and exploring relationships in massive, complex datasets; and graph representations and algorithms for search and social network analysis.
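
As an illustration of the memory-efficient sketch algorithms mentioned above, the following is a minimal Count-Min sketch in plain Python. It is a sketch under stated assumptions rather than course material: the class name, the width and depth defaults, and the choice of salted BLAKE2 hashes are all hypothetical and chosen only to keep the example self-contained.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency estimator for a stream of items (illustrative only)."""

    def __init__(self, width=1024, depth=4):
        # width and depth are assumed defaults, not values from the course.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash per row by salting the item with the row number.
        data = f"{row}:{item}".encode("utf-8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum across rows
        # is the tightest available upper bound on the true count.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


stream = ["a", "b", "a", "c", "a", "b"]
sketch = CountMinSketch()
for token in stream:
    sketch.add(token)
```

Memory use is fixed at width times depth counters regardless of stream length, and the estimate for any item is never below its true count.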

The course comprises readings and lectures on theory along with hands-on exercises and projects where students apply the theory through Python coding and interpretation of results.

The hands-on component of the course uses a variety of Python libraries, including scikit-learn, NetworkX, FAISS, and deep learning platforms and packages.

Students enrolled for graduate credit are required to complete, present, and report on an independent project.

This project must demonstrate mastery of the methods covered in the course as applied to a suitable real-world dataset.

Students may not take both CSCI E-96 and CSCI E-108 for degree or certificate credit.

Schedule note

Wednesdays, 6:00 pm to 8:00 pm, Aug 30 to Dec 18
