Data Science Horizons

Navigating the Data Frontier: Explore the World of Data Science Today

  • Crash Courses
  • eBooks
  • Practical Guides
Latest
  • A Practical Guide to Writing a Python Command Line Script (1 year ago)
  • Create a SQL REPL for JSON Files in Python (1 year ago)
  • How to Become a Data Engineer in 2025 (1 year ago)
  • A Comprehensive Overview of Prompt Engineering Techniques (1 year ago)
  • A Comprehensive Overview of RAG Strategies (1 year ago)
  • A Practical Guide to Concurrency and Parallelism in Python (1 year ago)
  • What is Data Science? A Beginner’s Guide (2 years ago, updated 1 year ago)
  • Advanced File Handling in Python: Working with CSV, JSON, and XML (2 years ago)
  • Building Python CLI Applications: A Step-by-Step Tutorial (2 years ago)
  • 5 Tips for Writing Efficient Python Code for Data Analysis (2 years ago)

  • Data Engineering

Database Normalization: A Practical Guide

Team DSH · 3 years ago · updated 1 year ago · 8 mins

Database normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. This practical guide covers the basics of normalization, including normal forms such as 1NF, 2NF, and 3NF.

  • Data Engineering

Understanding Data Sharding

Team DSH · 3 years ago · 6 mins

Data sharding is a fundamental technique in modern database management for improving system performance, scalability, and reliability. This article explores the core principles and practices of data sharding and how to distribute data effectively.

  • Crash Course
  • NLP

spaCy Crash Course for Data Scientists

Team DSH · 3 years ago · updated 1 year ago · 9 mins

This crash course provides an in-depth guide to spaCy, an open-source Python library built specifically for advanced NLP. Learn to use this library for your own NLP tasks.

  • Data Engineering

An Overview of Data Virtualization

Team DSH · 3 years ago · 5 mins

Data virtualization is a software layer that allows applications to access data from various sources without requiring the data to be moved or copied. It connects data consumers with data sources in real time. This article introduces data virtualization concepts, benefits, use cases, architectures, and leading products.

  • Data Engineering

Deploying a Data Engineering Project to Production: A Checklist

Team DSH · 3 years ago · 4 mins

This article provides a checklist of steps and considerations for deploying a data engineering project to production, covering infrastructure setup, testing, monitoring, and more. Following this checklist will help ensure a smooth deployment and transition to production systems.

  • General

Is Feature Engineering a Dying Art?

Team DSH · 3 years ago · 4 mins

Manual feature engineering remains an integral skill. A hybrid approach combining automation with human fine-tuning offers the ideal path forward.

  • Machine Learning

PyTorch: A Quick & Dirty Intro

Team DSH · 3 years ago · 7 mins

This article provides a hands-on introduction to PyTorch, covering installation, building a simple linear regression model, data preparation, training, evaluation, and further resources.

  • Crash Course
  • Data Engineering

Docker Crash Course for Data Scientists

Team DSH · 3 years ago · updated 1 year ago · 11 mins

This Docker crash course for data scientists covers Docker fundamentals such as architecture, images, containers, storage, and networking. It then explores using Docker for data science workflows, including environments, model training and deployment, and notebooks. Finally, it discusses best practices for optimization, orchestration, security, and monitoring.

  • Machine Learning

Handling Categorical Variables in scikit-learn: Strategies and Encoding Techniques

Team DSH · 3 years ago · 5 mins

Categorical variables must be encoded before use in scikit-learn models. This article covers three core strategies and best practices for handling categorical features in machine learning, with code examples.

  • Machine Learning

An Overview of Feature Selection Techniques in scikit-learn

Team DSH · 3 years ago · 4 mins

This article provides an overview of the main feature selection techniques available in scikit-learn.
