High Performance Pandas: Leveraging the power of pandas data manipulation.


Pandas is a standard data handling library that is commonly misused. However, this powerful library includes a set of APIs designed for high-speed data manipulation that most people never discover. This talk will show how to use pandas properly, present performance benchmarks against basic data structures, and provide a deeper understanding of how pandas handles data. It aims to make data analysis work easier and cleaner by exploiting the full power of pandas.

Type: Extended talk, 45 minutes (explain reasons)

Level: Medium

Speakers: Luis David Camacho

Speakers Bio: Luis David is passionate about data science and big data, and implements machine learning algorithms in his day-to-day work. His background is in Systems Engineering, with more than 10 years of experience designing and developing information systems. His most recent experience covers the use of machine learning algorithms for cybersecurity.

Time: 15:00 - 16:00 - 12/05/2019

Room: AB - Onapsis

Labels: python pandas data analysis data


This talk gives a walkthrough of the pandas API, showing the tools required for the day-to-day work of every data analyst or data scientist:

* Intro
* Starting a pandas data set from scratch
* Combining DataFrames:
  * Merge
  * Concat
  * Append
* Functional Approach:
  * Using Lambda functions
  * Using defined functions
* Data Querying:
  * The .at & .iat operators
  * Loc, query, eval
  * Multi-level index
* Series and DataFrames Operations:
  * Selecting Columns
  * Windowing & Expanding
* Grouping and Aggregations:
  * .groupby
  * Merge, join, concat, & append
  * Aggregations
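A few of the topics in the outline above can be sketched in a handful of lines. The example below is only an illustration, not material from the talk: the `orders` and `customers` tables, their column names, and their values are hypothetical. It touches merge and concat, `.at`/`.iat`, `.loc` and `.query`, a lambda applied to a column, rolling and expanding windows, and a `groupby` aggregation. `DataFrame.append` was removed in pandas 2.0, so the sketch uses `pd.concat` for appending rows.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer": ["ana", "bob", "ana", "cid"],
    "amount": [10.0, 25.0, 5.0, 40.0],
})
customers = pd.DataFrame({
    "customer": ["ana", "bob", "cid"],
    "country": ["AR", "UY", "AR"],
})

# Combining DataFrames: merge joins on a key column, concat stacks rows.
merged = orders.merge(customers, on="customer", how="left")
extra = pd.DataFrame({"order_id": [5], "customer": ["bob"], "amount": [15.0]})
all_orders = pd.concat([orders, extra], ignore_index=True)

# Data querying: .at reads one cell by label, .iat reads one cell by position,
# .loc filters with a boolean mask, .query filters with an expression string.
first_amount = orders.at[0, "amount"]
same_cell = orders.iat[0, 2]
big_loc = orders.loc[orders["amount"] > 20, ["order_id", "amount"]]
big_query = orders.query("amount > 20")[["order_id", "amount"]]

# Functional approach: a lambda mapped over a column, next to the
# equivalent vectorized expression (usually the faster choice).
doubled_lambda = orders["amount"].map(lambda x: x * 2)
doubled_vectorized = orders["amount"] * 2

# Windowing and expanding: a 2-row moving average and a running total.
rolling_mean = orders["amount"].rolling(window=2).mean()
running_total = orders["amount"].expanding().sum()

# Grouping and aggregation: total and average amount per country.
per_country = merged.groupby("country")["amount"].agg(["sum", "mean"])

print(per_country)
```

Where both forms exist, as with `doubled_lambda` and `doubled_vectorized`, the vectorized expression avoids a Python-level call per element, which is the kind of difference a benchmark against basic data structures would typically measure.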