👥 1 conference
🎤 6 talks
📅 Years active: 2021
No biography available.
Have you always wanted a flexible & interactive visualization that is easy for others to work with, without having to wrangle all the JavaScript libraries? Or do you want to build a user interface for your machine learning model? This talk has you covered with building data apps in Python using Streamlit, with examples of a travel visualization app based on Google Maps data & a UI for an ImageNet model.
In this talk, I showcase a couple of different use cases where you can build your data-focused applications using Streamlit, an open-source, pure-Python library.
In the first use case, I cover how you can build interactive dashboards using different Streamlit components. These dashboards can be easily deployed, & consumers can work with them without worrying about all the dependencies that would need to be installed to run the underlying Jupyter notebooks. In the showcase, I will go over how you can build a dashboard of your historical travels from your Google Maps Location History, including memories of those trips from Flickr.
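To give a flavour of what such a dashboard looks like in code, here is a minimal sketch (not the talk's actual app; the CSV file and its column names are invented for illustration):

```python
import pandas as pd
import streamlit as st

st.title("My travel history")

# Hypothetical CSV export of Google Maps Location History, flattened
# into rows with "lat", "lon" and "year" columns.
trips = pd.read_csv("location_history.csv")

# One interactive widget, zero JavaScript: filter the map by year.
year = st.slider("Year", int(trips["year"].min()), int(trips["year"].max()))

# st.map expects "lat"/"lon" columns and renders an interactive map.
st.map(trips.loc[trips["year"] == year, ["lat", "lon"]])
```

Saved as app.py, this is served as an interactive web app with `streamlit run app.py`.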
In the second showcase, I will describe how users can create a quick interface for their machine learning model using Streamlit. These interfaces are much faster to develop than a custom frontend built for the model with JavaScript libraries. In the demo, I showcase how I built a UI for an ImageNet model.
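Again as a rough sketch, not the talk's demo: the classify function below is a dummy stand-in for loading and calling a real ImageNet model (e.g. via torchvision or Keras).

```python
import streamlit as st
from PIL import Image

def classify(image):
    # Placeholder for a real ImageNet model call; returns a fixed
    # label and confidence so the sketch runs end to end.
    return "golden retriever", 0.87

st.title("ImageNet classifier demo")

# A file-upload widget replaces a hand-rolled JavaScript upload form.
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])

if uploaded is not None:
    image = Image.open(uploaded)
    st.image(image, caption="Input image", use_column_width=True)

    label, confidence = classify(image)
    st.write(f"Prediction: {label} ({confidence:.1%})")
```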
Both showcases will demonstrate how these data-based web apps can be built with nothing more than Python functions & Streamlit components.
Are you an educator who wants to design and teach an industry-aligned curriculum? Then you have come to the right place. In this talk, we will show how to design a better curriculum using natural language processing libraries in Python, namely spaCy and Textacy.
The curriculum in general, and the undergraduate curriculum in particular, is one of the most important pillars of an education system. The undergraduate curriculum has two main objectives: employability and preparation for higher education. The greatest challenge in designing an undergraduate curriculum is achieving a balance between employability skills and laying the foundation for higher education. Generally, the curriculum is a combination of core technical subjects, professional electives, humanities, and skill-oriented subjects. We used natural language processing and machine learning packages in Python to build a curriculum design system.
The steps to build the curriculum design system are described below:
1. The dataset was built from job profiles on job-listing websites such as stackoverflow.com, indeed.com, linkedin.com, and monster.com, as well as from the syllabi of competitive exams and qualifying exams for higher education.
2. We applied natural language processing techniques to the dataset to identify subjects and subject content, using spaCy, an industrial-strength natural language processing package in Python.
3. To generate syllabus content for a particular subject, a pointer-generator network was used. The pointer-generator network is a text summarization technique that combines extractive and abstractive summarization: the extractive technique extracts keywords from the dataset, whereas the abstractive technique generates new text from the existing text. The pointer-generator network was implemented using the scikit-learn machine learning package in Python.
4. The generated curriculum was then compared with the existing curriculum for insights such as what percentage of the curriculum is industry-oriented and what percentage is aimed at higher education and job-oriented skills. At this step, we used the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metric to compare the generated curriculum against the reference/proposed curriculum.
5. The above steps can be repeated with modified parameters to obtain better insights and a better curriculum.
This also gives us an idea of how we can have an evolving curriculum that can help us bridge the gap between industry and academia.
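As a hint of what step 2 looks like in practice, here is a minimal spaCy sketch (not from the talk; the job-posting text and the use of noun chunks as candidate subjects are illustrative assumptions):

```python
import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

job_posting = (
    "We are hiring a backend engineer with experience in Python, "
    "distributed systems, relational databases, and machine learning."
)

doc = nlp(job_posting)

# Noun chunks give a simple first pass at candidate subjects/skills...
candidates = {chunk.text.lower() for chunk in doc.noun_chunks}
print(candidates)

# ...and named entities can pick up technologies and organisations.
print([(ent.text, ent.label_) for ent in doc.ents])
```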
Have you ever had issues sharing your Jupyter notebooks? Ever had trouble with code that "works on my machine" only? Do you consider your research and development smooth and straightforward? Is your code scalable? Tough questions, I know. But if you've mentally answered 'no' to any of those, you could use a tool to help with some of the pain points of your workflow. Kedro is an open-source Python library that helps data scientists write data pipelines following software engineering best practices from the start. Known as the Django of ML/DS projects, Kedro is an opinionated framework based on cookiecutter data science that allows for modularity and scalability in data science projects.
In this talk, I will explore the workflow of a Kedro project, introduce some of the most outstanding features of the framework, such as the Data Catalog, and show how to convert a Jupyter Notebook into a Kedro project, allowing for scalability and team collaboration.
Talk structure
Intro (5 min)
The problem(s) (10 min)
The solution (5 min)
Demo - convert Notebook to Kedro project (15 min)
Q&A (5 min)
Audience
This talk focuses on data engineers, machine learning engineers, and data scientists who wish to learn how to write code beyond the Jupyter Notebook. The audience is expected to know the basics of Python and Jupyter Notebooks. All levels are welcome.
Key takeaways
By the end of this talk, attendees are expected to understand the basic setup of a Kedro project, know how to convert a Jupyter Notebook into a Kedro project, and know how to visualize the created data pipelines using the Kedro-Viz extension.
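For orientation, this is roughly what notebook code looks like once moved into a Kedro project (a minimal sketch, not from the talk; the function and dataset names are invented, and the dataset names would be declared in the project's Data Catalog, conf/base/catalog.yml):

```python
import pandas as pd
from kedro.pipeline import Pipeline, node

# Plain Python functions, as you would extract them from notebook cells.
def clean_trips(raw_trips: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows from the raw dataset."""
    return raw_trips.dropna()

def summarize_trips(clean: pd.DataFrame) -> pd.DataFrame:
    """Count trips per country."""
    return clean.groupby("country").size().reset_index(name="n_trips")

def create_pipeline(**kwargs) -> Pipeline:
    # "raw_trips", "clean_trips" and "trip_summary" are dataset names
    # resolved through the Data Catalog, keeping I/O out of the functions.
    return Pipeline(
        [
            node(clean_trips, inputs="raw_trips", outputs="clean_trips"),
            node(summarize_trips, inputs="clean_trips", outputs="trip_summary"),
        ]
    )
```

Because nodes only declare named inputs and outputs, Kedro can derive the execution graph from them, which is exactly what Kedro-Viz renders.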
While iterating on Python code, we want to see the result of our changes as quickly as possible. In this talk, we will review the different techniques available for reloading Python code. We will see how they work and when each is the best fit.
The talk will cover both cold and hot reload techniques:
Cold reload techniques reset the application state between each reload. Examples include Django and Flask's autoreload tools.
Hot reload techniques keep the application state despite the code changing. These include Jupyter kernels and 'reloadr' [1], an open-source module developed by the speaker to allow stateful hot code reloading.
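Cold reload is usually a framework flag away (e.g. Flask's app.run(debug=True) restarts the process on file changes). For the hot case, the standard library's importlib.reload shows the mechanism in its rawest form; a minimal sketch, assuming a hypothetical sibling module mymodule.py that you edit while the loop runs:

```python
import importlib
import time

import mymodule  # hypothetical module under active development

while True:
    # Re-executes mymodule's source in place. Note the classic pitfall:
    # objects created before the reload still reference the old classes
    # and functions -- handling that state is what tools like reloadr add.
    importlib.reload(mymodule)
    mymodule.do_work()  # hypothetical entry point
    time.sleep(2)
```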
This talk will provide practical insights on high-performance GPU computing in Python using the Vulkan Kompute framework. We will cover the trends in GPU processing and the architecture of Vulkan Kompute, implement a simple parallel multiplication example, and then dive into a machine learning example, building a logistic regression model from scratch that runs on the GPU.
In more detail, these are the topics of the talk:
• Motivations
• High-level overview of the OSS Vulkan initiative enabling cross-vendor GPU computing
• The Python Kompute framework and its architecture, which augments Vulkan
• A simple Python Kompute example implementing a parallel array multiplication
• An advanced Python Kompute example building a logistic regression model from scratch
• Further resources & further reading
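For a taste of the simple example, the sketch below is paraphrased from the Vulkan Kompute documentation rather than taken from the talk; names such as kp.Manager, mgr.tensor, kp.Shader.compile_source and the Op* operations have shifted between Kompute versions, so treat it as illustrative of the workflow (copy data to the GPU, dispatch a compute shader, copy results back) rather than version-exact API:

```python
import kp
import numpy as np

mgr = kp.Manager()  # picks a Vulkan-capable device

# Two input tensors and one output tensor, mirrored on the GPU.
tensor_a = mgr.tensor(np.array([2.0, 4.0, 6.0]))
tensor_b = mgr.tensor(np.array([0.0, 1.0, 2.0]))
tensor_out = mgr.tensor(np.array([0.0, 0.0, 0.0]))
params = [tensor_a, tensor_b, tensor_out]

# GLSL compute shader: one invocation per array element.
shader_src = """
#version 450
layout (local_size_x = 1) in;
layout (set = 0, binding = 0) buffer bufA { float a[]; };
layout (set = 0, binding = 1) buffer bufB { float b[]; };
layout (set = 0, binding = 2) buffer bufOut { float o[]; };
void main() {
    uint i = gl_GlobalInvocationID.x;
    o[i] = a[i] * b[i];
}
"""

algo = mgr.algorithm(params, kp.Shader.compile_source(shader_src))

(mgr.sequence()
    .record(kp.OpTensorSyncDevice(params))  # host -> GPU
    .record(kp.OpAlgoDispatch(algo))        # run the shader in parallel
    .record(kp.OpTensorSyncLocal(params))   # GPU -> host
    .eval())

print(tensor_out.data())  # expected: [0., 4., 12.]
```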
A more in-depth version of this talk can be found in this blog post:
• https://towardsdatascience.com/beyond-cuda-gpu-accelerated-python-for-machine-learning-in-cross-vendor-graphics-cards-made-simple-6cc828a45cc3
If you are interested in the C++ internals, as well as further performance optimizations, you can join the deeper dive at the HPC & Data Science Room:
• https://fosdem.org/2021/schedule/event/gpu_vulkan/
Other useful links:
• Vulkan Kompute Repo: https://github.com/EthicalML/vulkan-kompute
• Vulkan Kompute Docs: https://kompute.cc/
When someone starts learning about classes in Python, one of the first things they'll come across is "self" in the parameter list of a method. To keep it simple, it's usually explained that Python will automatically pass the current instance as the first argument to the method: "self" will refer to the instance the method was called on. This high-level explanation really helps with keeping the focus on learning the basics of classes, but it also side-steps what is really going on: it makes it sound like the process of inserting "self" is something automagical that the language just does for you. In reality, the mechanism behind inserting "self" isn't magical at all, and it's something you can very much play with yourself.
In this intermediate-level talk, Sebastiaan Zeeff will take you down into the heart of the Python data model to explain how the mechanism behind inserting "self" works. He will talk about the descriptor protocol and how it allows you to modify how attributes are accessed, assigned, or deleted in Python. He hopes that understanding how descriptors work will demystify "self" in Python.
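As a taste of the mechanism, here is a small, self-contained sketch (not from the talk) showing that a plain function is itself a descriptor, and that ordinary attribute lookup on an instance is what binds "self":

```python
class Greeter:
    def greet(self):
        return f"Hello from {self!r}"

g = Greeter()

# Functions implement the descriptor protocol. Looking "greet" up through
# the class calls Greeter.greet.__get__(g, Greeter), which returns a bound
# method with the instance already inserted as "self".
bound = Greeter.greet.__get__(g, Greeter)

print(bound())               # same result as g.greet()
print(g.greet() == bound())  # True: attribute access performed this binding
```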