Curious about Data Science and Machine Learning but don't know where to start? Join us for a free webinar that cuts through the jargon and provides a clear, accessible introduction to these exciting fields. In this 1-hour session, attendees explore the typical data science workflow and gain an introduction to popular Python libraries through demonstrations with real-world datasets. Data science/ML expert and seasoned trainer Kevin Clements guides you through:
- What data science is and how it's used to extract insights and knowledge from data.
- The core concepts of machine learning, including different types of algorithms and their applications.
- The typical data science workflow, from collecting and cleaning data to building and evaluating models.
- A live demonstration with popular Python libraries like Pandas and Scikit-learn, essential tools for any aspiring data scientist (see the sketch after this list).
- Real-world examples of data science and machine learning in action, demonstrating the power of these techniques.
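To give a flavor of the kind of demonstration described above, here is a minimal sketch, assuming a hypothetical CSV file and column names, that loads data with Pandas and fits a simple Scikit-learn model:

```python
# A minimal sketch of a Pandas + Scikit-learn workflow; the file name and
# column names ("age", "income", "churned") are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("customers.csv")            # load the raw data
X = df[["age", "income"]]                    # feature columns (assumed)
y = df["churned"]                            # label column (assumed)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LogisticRegression().fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```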
At the conclusion, Kevin will answer your questions.
Who Should Attend?
This webinar is designed for beginners with little to no prior experience in data science or machine learning. If you're curious about these fields and want to see what they're all about, this session is perfect for you. You'll leave with a solid understanding of the fundamentals and a taste of what's possible with data science and Python.
About the Presenter, Kevin Clements
As the CEO of Claremont AI, Kevin Clements brings over 35 years of experience in the technology and communications sectors. For nearly three decades, he has empowered organizations of all sizes to streamline their requirements, develop efficient strategies, and implement effective software solutions. According to CFODive, 28% of organizations in the US are using Generative AI tools like ChatGPT to deal with the increasing complexity of data store systems and increase productivity. Kevin recognizes the critical role of Data Science and Machine Learning tools in modern data management practices. His decade of expertise in this area ensures that clients receive tailored solutions that drive success in a rapidly evolving landscape.
Amazon Web Services (AWS) offers a comprehensive suite of services for running data science workloads in the cloud at scale. These include Amazon SageMaker, a managed platform for building, training, and deploying machine learning (ML) models; AI services such as Amazon Textract, Polly, Lex, and Comprehend; and data services such as AWS Glue, Amazon EMR, and Amazon EC2.
In this 1-hour webinar, senior AWS instructor and consultant Michael Forrester discusses the benefits of transitioning data science workloads to AWS and walks through the steps involved in the migration process. By the end of this webinar, attendees will have a solid understanding of how to migrate their data science workloads to the AWS cloud.
In this 1-hour session, Michael will cover how to:
- Decide among the various options for developing, managing, and deploying data science workloads in AWS
- Transition on-premises notebooks to AWS
- Store data used in model development
- Obtain real-time and batch predictions (a real-time example is sketched after this list)
- Plan the general steps to earn the AWS Certified Machine Learning – Specialty certification
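As a taste of the prediction topic above, here is a minimal sketch of requesting a real-time prediction from an already-deployed SageMaker endpoint with boto3; the endpoint name and payload are hypothetical:

```python
# Request a real-time prediction from a deployed SageMaker endpoint.
# The endpoint name and CSV payload below are hypothetical.
import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="my-model-endpoint",    # hypothetical endpoint
    ContentType="text/csv",
    Body=b"34,52000,1",                  # one feature row, CSV-encoded
)
print(response["Body"].read().decode())  # the model's prediction
```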
Audience
Decision makers and those in technical roles who want to understand the options and tools AWS offers for moving data science workloads to the cloud.
Prerequisites
A basic understanding of ML/AI workloads, such as sentiment analysis and probability forecasting, plus basic familiarity with AWS and its core services.
Duration: 60 minutes
Cost: Free!
About the Presenter:
Michael Forrester is an Infrastructure Engineer with over 20 years of experience in the IT industry, specializing in all things DevOps. As an Amazon Authorized Instructor and trainer for almost a decade, he is deeply committed to cultivating DevOps expertise, aiming to optimize software delivery through efficiency, speed, and transparency. A big believer in people as the enabler, he also brings extensive experience in MLOps and AIOps, positioning him as a comprehensive authority in both traditional and emerging IT operations disciplines.
Artificial intelligence (AI) deals with creating systems that can reason, learn, and act autonomously. AI has been around for decades, but it has only been in the last few years that AI has really started to take off due in part to the increased availability of data and the development of new algorithms. AI has the potential to revolutionize many industries and has many benefits, but it also comes with ethical concerns and risks for organizations wishing to implement it into their processes.
Join our complimentary AI webinar to discover more about the building blocks that lie beneath today’s AI products and explore how they work at a conceptual level. We’ll also investigate how algorithms are applied in recent AI products and discuss the feasibility of using these AI applications in your projects. In this 1-hour webinar, Chris Penick examines the benefits and challenges that AI integration may bring to your organization and discusses:
- The foundational algorithms of AI and machine learning
- How to integrate AI into development and business products
- What ethical implications arise from the use or restriction of AI
- How to use AI and ML safely and productively
At the conclusion, Chris will welcome your questions.
Audience:
Anyone interested in learning about AI. No AI experience is assumed.
Duration: 60 minutes
Cost: Free!
About the Presenter:
Chris Penick, Stream Lead for Web Age Solutions, has over thirty years of experience in the IT industry across a variety of platforms. Chris has guided clients in cybersecurity, architecture, web development, and data science. He holds multiple certifications in security and development from Microsoft, CompTIA, ISC2, and EC Council. Chris teaches many courses for Accelebrate and Web Age, including Terraform, Kubernetes, Data Science with Python, AI, Java, and more.
Python is a powerful yet quick-to-learn programming language, used to mine data for patterns and “hidden gems”, build recommendation systems, and automate the discovery and analysis process. The Python community has created many free and open-source packages and tools for data engineering, analysis, and visualization.
During this one-hour webinar, we’ll explore some of the most widely used Python data science libraries, such as pandas, NumPy, matplotlib, and seaborn, and how they can improve operational efficiency and productivity through:
- Open-source licensing, with no license fees
- Text mining (not an Excel strong point)
- Reusable “chunks” of analysis, custom business functions, and libraries that are easy to share
Other benefits we’ll discover together during this presentation include working across multiple data sources, interoperating across different hardware, software, operating systems, and cloud platforms, and supporting AI, ML, and Big Data tasks, areas where Excel is limited.
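As a preview, here is a minimal sketch that touches all four libraries; the CSV file and column names are hypothetical stand-ins for a real dataset:

```python
# Load data with pandas, transform with NumPy, plot with matplotlib/seaborn.
# "sales.csv" and the "revenue" column are hypothetical.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("sales.csv")                  # load the raw data
df["log_revenue"] = np.log1p(df["revenue"])    # NumPy for numeric transforms
print(df.describe())                           # quick pandas summary

sns.histplot(df["log_revenue"])                # seaborn for statistical plots
plt.title("Distribution of log revenue")
plt.show()
```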
Optimization helps improve the efficiency of a system, leading toward a best solution. This webinar will look at effective techniques used across major industries today, applying optimization to work successfully with large-scale data on-premises and in the cloud.
Topics:
• Common analytical questions and Tableau’s sweet spot
• Basic visualization functionality
• Dashboards and stories
• Accessing data
• Visualizations beyond ‘Show Me’ – waterfall charts, racing charts & polygon mapping
The ever-growing data lakes and the large structured and unstructured data sets organizations face today require a role that combines knowledge and skills across computer science, statistics, and mathematics. A well-trained data scientist must be comfortable with the many tools and techniques used to analyze, process, and model data. They must apply industry knowledge, build understanding, and question existing conventions to uncover solutions to business challenges from sources such as raw files, smart devices, social media, and other datasets that don’t generally fit into a database, and then interpret the results into actionable plans for their organizations, drawing on both technology and social science to find trends and manage data.
A data engineer is responsible for developing and maintaining data pipelines over the ever-growing data lake and the large structured and unstructured data sets in the organization. The goal of data engineering is to architect and build pipelines that provide the functionality, speed, scalability, and reliability the organization needs to use data effectively. Data engineers work across the stages of a pipeline, from acquisition and transport to storage, processing, and serving, continually improving their methods and practices. Today’s data engineer must become proficient at programming, learn automation and scripting, understand many different data stores, master data processing techniques, schedule workflows efficiently, know the ever-changing cloud landscape, and keep up with trends.
This comprehensive webinar will delve into today’s best tools and techniques, which great data scientists use to efficiently and effectively understand outcomes from their datasets and to capture, transform, and shape their data stores.
Some of the data science tools explored, with demos: PySpark pulling a CSV file into a Spark table, Python’s scikit-learn, loading a CSV file from AWS S3 into a database with Glue and Athena, and QuickSight visualization. Some of the data engineering tools explored, with demos: Python pulling a CSV file into a database, resolving missing data, and merging AWS S3 CSV files into a database with Glue and Athena SQL.
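To illustrate the first demo at a glance, here is a minimal PySpark sketch, with a hypothetical file and column names, that loads a CSV, repairs missing values, and queries it as a table:

```python
# Load a CSV into Spark, fill missing values, and query it with Spark SQL.
# "events.csv" and its columns ("amount", "category") are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-ingest").getOrCreate()
df = spark.read.csv("events.csv", header=True, inferSchema=True)

df = df.na.fill({"amount": 0.0})           # resolve missing numeric values
df.createOrReplaceTempView("events")       # expose the data to Spark SQL
spark.sql("SELECT category, COUNT(*) AS n FROM events GROUP BY category").show()
```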
In this webinar, we’ll give an introduction to Monte Carlo simulation methods for stochastic equations in the context of estimating portfolio risk. We’ll use Python with Pandas, NumPy, SciPy, and Seaborn to explore expected portfolio payout risk, given the size of the portfolio and underlying assumptions about the distribution of the rate and cost of events.
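For a sense of the approach, here is a minimal sketch, assuming Poisson-distributed event counts and lognormal event costs; the portfolio size and distribution parameters are illustrative, not part of the webinar’s actual model:

```python
# Monte Carlo estimate of total portfolio payout, assuming Poisson event
# counts and lognormal per-event costs; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(seed=7)
n_policies, n_trials = 10_000, 20_000
event_rate = 0.02                        # assumed claim rate per policy

counts = rng.poisson(lam=n_policies * event_rate, size=n_trials)
payouts = np.array([rng.lognormal(mean=8.0, sigma=1.0, size=c).sum()
                    for c in counts])

print("Expected payout:", payouts.mean())
print("95th percentile (tail risk):", np.percentile(payouts, 95))
```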
In this webinar, we’ll discuss the capabilities of Python and PySpark and the role they play in the data engineering space, both on a single machine and on a cluster.
Are you concerned about the scalability, performance, and cost of your existing data warehouses?
We have a solution for you!
Join our live webinar on Snowflake. Snowflake provides Data Warehousing as a Service, opening the door to numerous benefits, including near-zero maintenance, on-demand scaling in just a few seconds, simplified data sharing, and zero-copy cloning.
Topics:
1. Evolution of Data Warehousing Technologies
2. What is Snowflake?
3. Snowflake vs. Redshift
4. Key Concepts
5. Snowflake Architecture
6. Setting up Snowflake Trial Account
7. Demo
In this webinar we’ll discuss some of the aspects of Apache Airflow as related to workflow management. Airflow aims at taking the classical concept of a scheduled ETL process to the next level with the added value in the form of re-tries, backfills, scaling, and more.
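A minimal DAG sketch shows the added value mentioned above; it assumes Airflow 2.x, and the task commands are placeholders:

```python
# A two-task Airflow DAG with automatic retries and backfills enabled.
# Assumes Airflow 2.x; the bash commands are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=True,                        # backfill runs since start_date
    default_args={"retries": 3},         # automatic re-tries on failure
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    load = BashOperator(task_id="load", bash_command="echo load")
    extract >> load                      # run extract before load
```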
In this webinar we’ll discuss some of the topics related to Robotic Process Automation (RPA). RPA is an approach to automating repetitive tasks and workflows using “bots” (robots / smart agents), which goes a long way toward unburdening staff from dehumanizing, “robotic” office activities.
In this webinar we’ll review Splunk Platform’s capabilities for data onboarding and searching.
Chapter 1. Defining Data Science
Chapter 2. Data Processing Phases
Chapter 3. Data Science and Machine Learning Terminology and Algorithms
Chapter 4. Repairing and Normalizing Data
Chapter 5. Data Visualization
In this webinar we will review common tasks that can be solved with PySpark. Examples will include SQL (DataFrame)-centric processing, creating pivot tables, and EDA.
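As a preview of the pivot-table example, here is a minimal PySpark sketch with inlined data so it is self-contained:

```python
# DataFrame-centric processing and a pivot table in PySpark.
# The rows are inlined sample data, not a real dataset.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pivot-demo").getOrCreate()
df = spark.createDataFrame(
    [("2024-01", "east", 100), ("2024-01", "west", 80),
     ("2024-02", "east", 120), ("2024-02", "west", 90)],
    ["month", "region", "sales"],
)
# One row per month, one column per region, summing sales.
df.groupBy("month").pivot("region").sum("sales").show()
```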
In this webinar, we’ll review the core capabilities of Python that enable developers to solve a variety of data engineering problems. We’ll also review NumPy and pandas libraries, with a focus on such topics as the need for understanding your data, selecting the right data types, improving performance of your applications, common data repairing techniques, and so on.
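One of those topics, selecting the right data types, can be previewed in a few lines; the column values are synthetic:

```python
# Choosing the right pandas dtype: converting a repetitive string column
# to "category" typically cuts its memory footprint substantially.
import pandas as pd

df = pd.DataFrame({"status": ["open", "closed", "open"] * 100_000})
print(df.memory_usage(deep=True)["status"])   # bytes as plain object dtype

df["status"] = df["status"].astype("category")
print(df.memory_usage(deep=True)["status"])   # bytes after conversion
```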
In this webinar we will review the core capabilities of PySpark as well as PySpark’s areas of specialization in data engineering, ETL, and Machine Learning use cases.
This course gives managers a foundation for leading data-driven projects. It is an introduction to the major concepts of data-driven teams and data-specific project challenges. This is not a data analytics or programming course. This is a project management course. The focus of the course will be on managing a data-driven team and gaining insights from your data. The goals for the course are very practical:
• Introduce project managers to data science terms, tools and the team
• Introduce project managers to a data-science lifecycle
• Understand challenges that are specific to data-driven projects
Audience: Project Managers, Business Analysts, Managers, Directors
Course Outline:
Introduction to Big Data
• Big Data History
• What is Big Data?
• Big Data Definition
• What Big Data Isn’t
• Big Data Example
The Data-Science Lifecycle
• A Typical Data Science Product
• What are Big Data Projects?
• Applying the SDLC to Data Driven Products
• A Data Science Lifecycle (DSLC)
The Data Science Team
• Traditional Project Team Roles
• The Data Science Team Roles
• The Knowledge Explorer
• Analysis Versus Reporting
• Asking Questions
• Learning
Data-Science Team Tools
• Insight Board
• Creating an Insight Board
In this webinar, we will talk about how you can use Python in the realm of applied data science. Python leverages specialized libraries like NumPy, pandas, and scikit-learn to aid data science practitioners in solving problems related to regression, sample classification, and clustering, among other tasks.
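To illustrate one of those tasks, here is a minimal clustering sketch with scikit-learn; the synthetic points stand in for a real dataset:

```python
# K-means clustering on synthetic 2-D data with scikit-learn.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic blobs of points standing in for real observations.
points = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("Cluster centers:\n", kmeans.cluster_centers_)
```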
This one-hour webinar will introduce participants to an assortment of machine learning technologies. We will start with a definition of Artificial Intelligence, then move into the relationship between Artificial Intelligence and Neural Networks. Next, we will explore the three types of Machine Learning (supervised, unsupervised, and reinforcement). We will also address common indicators used to assess the quality of a machine learning model. The session wraps up with a brief demonstration of a custom-built Python model, as well as demonstrations using the four major AI vendors (Amazon, Google, IBM, and Microsoft).
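The quality indicators mentioned above can be sketched in a few lines with scikit-learn; the true and predicted labels below are hypothetical:

```python
# Common classification quality indicators, computed on hypothetical labels.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # hypothetical ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # hypothetical model predictions

print("accuracy: ", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
```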
Machine Learning can help businesses reengineer their processes for higher revenue, higher customer satisfaction and lower cost. In this short course we will explore how exactly machines learn. We will then investigate a few AI techniques that businesses can use today to improve their performance.
R has won itself a solid reputation in the data analysis realm. In this webinar we will talk about some aspects of this programming language and its libraries that make it an indispensable tool for data science projects.
To stay competitive, organizations have started adopting new approaches to data processing and analysis. For example, data scientists are turning to Apache Spark to process massive amounts of data using its distributed compute capability and built-in Machine Learning (ML) libraries. In this webinar, we will talk about Spark’s machine learning libraries and select ML algorithms.
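As a taste of those libraries, here is a minimal pyspark.ml sketch; the inlined rows are illustrative stand-ins for real training data:

```python
# Assemble feature vectors and fit a logistic regression with pyspark.ml.
# The inlined rows are illustrative stand-ins for real data.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("ml-demo").getOrCreate()
df = spark.createDataFrame(
    [(34.0, 52000.0, 1), (22.0, 31000.0, 0), (45.0, 88000.0, 1)],
    ["age", "income", "label"],
)
assembler = VectorAssembler(inputCols=["age", "income"], outputCol="features")
model = LogisticRegression().fit(assembler.transform(df))
print("Coefficients:", model.coefficients)
```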
Talend has been recognized by Gartner as a Leader in open source integration solutions. This webinar provides a quick introduction to the Talend Unified Platform, which enables and empowers organizations to focus on their business objectives by minimizing the effort and cost of Data Integration, ETL, Data Quality, Master Data Management, Application Integration, and Business Process Management.
Apache NiFi is a distributed scalable framework for data routing, transformation, and system mediation. In this webinar, you will learn all you need to know about this versatile framework so you can start using it in your data transfer and data transformation projects.
Apache Kafka is a distributed, scalable, and fault-tolerant platform for handling real-time large data feeds, which offers near network processing speeds. In this webinar, we will review Kafka’s architecture and ways to interface with it.
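For a taste of interfacing with Kafka from Python, here is a minimal sketch, assuming the third-party kafka-python package and a broker at localhost:9092:

```python
# Publish to and consume from a Kafka topic. Assumes the third-party
# kafka-python package and a broker running at localhost:9092.
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("events", b"sensor-reading-42")    # publish to 'events'
producer.flush()

consumer = KafkaConsumer("events",
                         bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest")
for message in consumer:
    print(message.value)                         # consume the feed
    break
```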
This webinar will introduce MongoDB, a database for the modern world. MongoDB is not an RDBMS; it is a NoSQL database. Data is stored in BSON, a binary equivalent to JSON, and the database is designed to scale to support massive data sets. MongoDB also supports sharding and allows processes like Map/Reduce to run across clustered servers.
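A few lines of pymongo show the document model in action; this sketch assumes a MongoDB server at the default localhost port, and the database, collection, and fields are hypothetical:

```python
# Insert a JSON-like document and query it back with pymongo.
# The "shop" database, "orders" collection, and fields are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

orders.insert_one({"item": "widget", "qty": 3, "tags": ["new", "sale"]})
for doc in orders.find({"qty": {"$gt": 1}}):   # query by predicate
    print(doc)
```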
• Applied Data Science and Business Analytics
• Algorithms, Techniques and Common Analytical Methods
• Machine Learning Introduction
• Visualizing and Reporting Processed Results
• The R Programming Language
• Elements of Functional Programming
• Apache Spark Introduction
In this webinar, we will talk about some of the popular data science algorithms, analytical methods, tools and systems used in the desktop environment as well as in the clustered environments (Hadoop).
Are you hearing a LOT about Apache Spark? Find out why in this 1-hour webinar, which addresses:
• What is Spark?
• Why is there so much talk about Spark?
• How does Spark compare with MapReduce?
• How does Spark fit in with the rest of the Hadoop ecosystem?
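The classic word count hints at why Spark draws so much attention: what takes a full MapReduce job fits in a few lines of PySpark. The input path below is hypothetical:

```python
# Word count in PySpark; "input.txt" is a hypothetical input path.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount").getOrCreate()
counts = (spark.sparkContext.textFile("input.txt")
          .flatMap(lambda line: line.split())     # map: emit words
          .map(lambda word: (word, 1))
          .reduceByKey(lambda a, b: a + b))       # reduce: sum per word
print(counts.take(10))
```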
In this webinar, participants will learn about:
• MapReduce Programming Model
• Data Querying and Processing with Hadoop
• Amazon Elastic MapReduce
• Google App Engine
• MongoDB Query Language
• HiveQL
The term “Microservice” has recently been coined to describe a rapidly provisionable, independently deployable service with narrow and distinct functionality that is accessible over common communication protocols. As is often the case with new IT technology terms, many view “Microservice” as too fuzzy and misleading, much like the terms NoSQL and Big Data. In this webinar, we will try to make sense of it.