The DISTINCT clause is used in SQL to remove duplicate rows from the result set and return only unique records.


Basic Syntax:

SELECT DISTINCT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column;
  • column1, column2, ...: The columns to select with duplicates removed.
  • WHERE condition: Optional filtering condition.
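  • ORDER BY column: Optional sorting. With SELECT DISTINCT, SQL Server and PostgreSQL (among others) require the ORDER BY columns to appear in the select list.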


Examples:

  1. Selecting Unique Supplier IDs:

    SELECT DISTINCT SupplierID
    FROM Products;
    • This returns a list of unique SupplierID values from the Products table, removing any duplicates.
  2. Selecting Unique First Names:

    SELECT DISTINCT firstName
    FROM Person
    ORDER BY firstName;
    • This returns distinct first names in alphabetical order, ignoring duplicates.
  3. Counting Unique Values:

    SELECT COUNT(DISTINCT firstName) AS uniqueFirstNames
    FROM Person;
    • This counts the number of distinct first names in the Person table.
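To try examples 2 and 3 without a database server, here is a minimal sketch that runs them against an in-memory SQLite database through Python's sqlite3 module. The sample rows are hypothetical, and SQLite is only an assumption standing in for whatever engine you actually use; the Person table and firstName column come from the examples above.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (firstName TEXT)")
conn.executemany(
    "INSERT INTO Person (firstName) VALUES (?)",
    [("Maria",), ("Ali",), ("Maria",), ("Chen",), ("Ali",)],
)

# Example 2: distinct first names, sorted alphabetically.
distinct_names = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT firstName FROM Person ORDER BY firstName"
    )
]

# Example 3: number of distinct first names.
unique_count = conn.execute(
    "SELECT COUNT(DISTINCT firstName) AS uniqueFirstNames FROM Person"
).fetchone()[0]

print(distinct_names)  # ['Ali', 'Chen', 'Maria']
print(unique_count)    # 3
conn.close()
```

Five rows go in, but only three names come out, because the repeated 'Maria' and 'Ali' rows collapse into one each.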

COUNT with DISTINCT:

In SQL Server, to count the distinct values in a given column, use the COUNT function together with the DISTINCT keyword. The query looks like this:

SELECT COUNT(DISTINCT column_name) AS DistinctCount
FROM table_name;

Explanation:

  • COUNT(DISTINCT column_name): Counts the distinct (non-repeated) values in the column; NULLs are not counted.
  • AS DistinctCount: Gives the result a name (optional).
  • FROM table_name: Specifies the table that contains the column.

Practical example:

Suppose you have a table named Employees with a Department column, and you want to know how many different departments there are:

SELECT COUNT(DISTINCT Department) AS DistinctDepartments
FROM Employees;

This returns the number of different departments in the Employees table.
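As a rough illustration of how COUNT(DISTINCT ...) differs from the other COUNT forms, the sketch below runs all three against a small in-memory SQLite table from Python. The Name column and the sample rows are hypothetical, and SQLite is an assumption standing in for SQL Server; the counting rules shown are standard SQL behavior.

```python
import sqlite3

# Hypothetical rows: two employees share 'Sales', one has no department (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT)")
conn.executemany(
    "INSERT INTO Employees (Name, Department) VALUES (?, ?)",
    [
        ("Ana", "Sales"),
        ("Ben", "Sales"),
        ("Caro", "IT"),
        ("Dev", "HR"),
        ("Eli", None),  # None is stored as NULL
    ],
)

total_rows, non_null, distinct_departments = conn.execute(
    "SELECT COUNT(*), COUNT(Department), COUNT(DISTINCT Department) "
    "FROM Employees"
).fetchone()

print(total_rows)            # 5: every row
print(non_null)              # 4: rows where Department is not NULL
print(distinct_departments)  # 3: Sales, IT, HR (NULL is not counted)
conn.close()
```

Note that COUNT(DISTINCT Department) skips the NULL row entirely, so an employee without a department does not add an extra "department" to the count.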

Key Considerations:

  • NULL Values: DISTINCT considers all NULL values as the same, so only one NULL value will be returned.

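  • Performance: For plain de-duplication with no aggregate functions, most query optimizers produce the same or a very similar plan for SELECT DISTINCT and the equivalent GROUP BY, so neither is reliably faster. Check the execution plan if performance matters.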

  • DISTINCT Across Multiple Columns: When used with multiple columns, DISTINCT returns unique combinations of those column values.

    SELECT DISTINCT gender, firstName
    FROM Person;
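The NULL behavior described above can be checked with a short sketch. It assumes an in-memory SQLite database driven from Python, and the ProductName column and sample rows are hypothetical; only Products and SupplierID come from the earlier examples.

```python
import sqlite3

# Hypothetical rows: two products have no supplier (NULL SupplierID).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, SupplierID INTEGER)")
conn.executemany(
    "INSERT INTO Products (ProductName, SupplierID) VALUES (?, ?)",
    [("Tea", 1), ("Rice", None), ("Milk", 1), ("Salt", None), ("Oil", 2)],
)

distinct_ids = [
    row[0] for row in conn.execute("SELECT DISTINCT SupplierID FROM Products")
]

print(len(distinct_ids))         # 3: the values 1 and 2, plus a single NULL
print(distinct_ids.count(None))  # 1: both NULL rows collapse into one
conn.close()
```

No ORDER BY is used here, so the order of distinct_ids is not guaranteed; the sketch only inspects how many values come back.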

DISTINCT vs. GROUP BY:

  • DISTINCT: The clearest way to ask for unique values from the selected columns when no aggregate calculation is needed.
  • GROUP BY: More powerful, used for grouping rows and calculating aggregates like SUM, AVG, COUNT, etc. Without aggregates it returns the same rows as DISTINCT, usually at comparable cost.
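To compare the two on a small data set, here is a sketch that runs both forms against an in-memory SQLite table from Python. The sample rows are hypothetical and SQLite is an assumption; the sketch compares result sets only and says nothing about timing on any particular engine.

```python
import sqlite3

# Hypothetical sample rows for the Person table used elsewhere in these notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (gender TEXT, firstName TEXT)")
conn.executemany(
    "INSERT INTO Person (gender, firstName) VALUES (?, ?)",
    [("F", "Maria"), ("M", "Ali"), ("F", "Maria"), ("F", "Chen"), ("M", "Ali")],
)

# De-duplication only: both queries yield the same set of values.
via_distinct = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT gender FROM Person")
)
via_group_by = sorted(
    row[0] for row in conn.execute("SELECT gender FROM Person GROUP BY gender")
)

# Aggregation per group: this is where GROUP BY is required.
rows_per_gender = dict(
    conn.execute(
        "SELECT gender, COUNT(*) FROM Person GROUP BY gender"
    ).fetchall()
)

print(via_distinct)     # ['F', 'M']
print(via_group_by)     # ['F', 'M']
print(rows_per_gender)  # {'F': 3, 'M': 2} (key order may vary)
conn.close()
```

The first two queries are interchangeable; the third has no DISTINCT equivalent because it needs a per-group aggregate.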

Handling Duplicates with Multiple Columns:

When DISTINCT is applied to multiple columns, uniqueness is judged on the combination of values across those columns, not on each column separately:

SELECT DISTINCT gender, firstName
FROM Person
ORDER BY gender, firstName;

This removes duplicate combinations of gender and firstName.
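A small sketch makes the combination rule concrete. It assumes an in-memory SQLite database driven from Python, with hypothetical sample rows in which the name 'Chen' appears under both gender values.

```python
import sqlite3

# Hypothetical rows: ('F', 'Maria') and ('M', 'Ali') are each inserted twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (gender TEXT, firstName TEXT)")
conn.executemany(
    "INSERT INTO Person (gender, firstName) VALUES (?, ?)",
    [
        ("F", "Maria"),
        ("M", "Ali"),
        ("F", "Maria"),
        ("F", "Chen"),
        ("M", "Chen"),
        ("M", "Ali"),
    ],
)

combos = conn.execute(
    "SELECT DISTINCT gender, firstName FROM Person ORDER BY gender, firstName"
).fetchall()

names_only = conn.execute(
    "SELECT COUNT(DISTINCT firstName) FROM Person"
).fetchone()[0]

print(combos)
# [('F', 'Chen'), ('F', 'Maria'), ('M', 'Ali'), ('M', 'Chen')]
print(names_only)  # 3
conn.close()
```

'Chen' shows up twice in the two-column result because ('F', 'Chen') and ('M', 'Chen') are different combinations, even though there are only three distinct first names.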


Important Notes:

  • DISTINCT applies to the whole select list, not to a single column: uniqueness is judged on the columns listed in SELECT, not on the full rows of the underlying table.
  • The result will show distinct combinations of values from the columns listed in the SELECT clause.
  • If you’re only interested in eliminating duplicates for a single column, DISTINCT states that intent more directly than GROUP BY.

In summary, DISTINCT is a simple and efficient way to remove duplicate records from a result set based on one or more columns, providing a list of unique values or combinations of values.