SQL Server Hosting - HostForLIFE :: Change Data Capture - CDC with SQL server

March 22, 2024 07:29 by author

What is SQL Server CDC?
SQL Server CDC, or Change Data Capture, is a technology used by Microsoft SQL Server to systematically record all changes made to a database. It records events such as inserts, updates, and deletes within a specific timeframe, as well as any changes to the database structure. CDC, unlike SQL Server Change Tracking, scans transaction logs for data changes and stores the results in a separate table or database. Notably, CDC uses minimum server resources and does not delay transactions with locks or reads, distinguishing it from other tracking approaches.

The use of SQL CDC functionality is especially useful when transferring or replicating data from a SQL Server database to data lakes or warehouses such as S3, Redshift, Azure Synapse, SQL Server, Databricks, PostgreSQL, BigQuery, or Snowflake. This feature enables effective replication of incremental data, ensuring that the data lake or warehouse is consistently updated, and it also serves as a centralized hub for accessing data from many SQL Server repositories.

SQL Server, a relational database management system built by Microsoft, is a critical tool for efficiently storing and retrieving data, and it is widely used by applications for this purpose. Notably, SQL Server versions after 2008 include Change Data Capture (CDC) capabilities.

SQL Server Change Data Capture or CDC points to note

Enabling this functionality requires sysadmin privileges. (Data Migration 101).
This functionality is not supported in SQL Server Web, Express, and Standard editions when migrating to Databricks. (Easy Migration Method).
CDC must be enabled both at the database level and the table level for all relevant tables. (Debezium CDC Explained).
To utilize Change Data Capture, you must initiate the SQL Server Agent. (Change Data Capture Types and CDC Automation)

How to enable CDC in SQL Server?
At DB level
To activate Change Data Capture (CDC) for a particular table to facilitate auditing, the initial step involves enabling CDC at the database level. This action necessitates a member of the SYSADMIN fixed server role to utilize the sys.sp_cdc_enable_db system stored procedure.
Use <databasename>;
EXEC sys.sp_cdc_enable_db;

Example

At Table level
Create a Table. example, person.
CREATE TABLE [dbo].[Persons](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FirstName] [nvarchar](max) NOT NULL,
    [LastName] [nvarchar](max) NOT NULL,
    [SSN] [nvarchar](max) NOT NULL,
      NOT NULL,
    [DOB_sys] [date] NOT NULL,
      NOT NULL,
    [CurrentTime_sys] [datetimeoffset](7) NOT NULL,
    CONSTRAINT [PK_Persons] PRIMARY KEY CLUSTERED
    (
        [Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Example

Once CDC is enabled at the database level, subsequent activation for tracking and auditing DML changes on specific tables can be conducted by a member of the db_owner fixed database role. This is achieved through the sys.sp_cdc_enable_table system stored procedure, offering options to customize the tracking process:

Captured Columns: By default, all columns in the source table are tracked. However, for privacy or performance considerations, a subset of columns can be specified using the @captured_column_list parameter.
Change Table Filegroup: The change table is typically placed in the default filegroup of the database. However, for control over placement, the @filegroup_name parameter allows specifying a particular filegroup for the change table. It's advisable to allocate change tables to a separate filegroup from source tables for better management.
Access Control: Access to the change data can be regulated using a designated role, which can be an existing fixed server role or a database role. If the specified role doesn't exist, it's automatically created. Users require SELECT permission on all captured columns, and if a role is specified, users outside of sysadmin or db_owner roles must also be members of the specified role.
Support for Net Changes: If @supports_net_changes is set to 1, a net changes function is generated for the capture instance. This function returns only one change for each distinct row changed within a specified interval. To enable net changes queries, the source table must have a primary key or unique index, and if an index is used, it must be specified using the @index_name parameter.
Primary Key Handling: If change data capture is enabled on a table with an existing primary key, subsequent changes to the primary key are not allowed without first disabling CDC. If there is no primary key at the time of enabling the CDC, any subsequent addition of a primary key is ignored.

Overall, these options offer flexibility in configuring CDC to suit specific auditing requirements and access control policies within the database environment.
USE <databasename>
GO

EXEC sys.sp_cdc_enable_table
@source_schema = ‘<schema_name>’,
@source_name = ‘<table_name>’,
@captured_column_list = ‘<column_list>,’
@supports_net_changes = 0,
@filegroup_name = ‘’
@role_name = null,
@supports_net_changes = 0;

Example

In SQL Server, how can I verify if CDC is enabled on a table?
Verify that the connection user has access; this query should not return an empty result.

EXEC sys.sp_cdc_help_change_data_capture

Check if CDC has been enabled at the database level.

USE master
GO

select name, is_cdc_enabled
from sys.databases
where name = ‘<databasename>’

GO

If is_cdc_enabled is 0 then CDC is not enabled for the database.
If is_cdc_enabled is 1 then CDC is already enabled for the database name.

Check if CDC is enabled at the table level.
USE databasename
GO

SELECT name, type, type_desc, is_tracked_by_cdc
FROM sys.tables
WHERE name = '<table_name>'

GO

If is_tracked_by_cdc = 1 then CDC is enabled for the table.
If is_tracked_by_cdc = 0 then CDC is not enabled for the table.

Once, CDC is enabled at the Database level, you will see that a schema with the name ‘cdc’ has been created.

Once CDC is enabled on the table, a number of system tables will be created under the CDC schema of the database to store the CDC-related information. These tables include the following

CDC.captured_columns table that contains the list of captured columns.
CDC.change_tables table that contains the list of tables that are enabled for capture.
CDC.ddl_history table that records the history of all the DDL changes since capture data was enabled.
CDC.index_columns table that contains all the indexes that are associated with the change table.
CDC.lsn_time_mapping table that is used to map the LSN number with the time and finally one change table for each CDC-enabled table that is used to capture the DML changes on the source table, as shown below.

and the SQL Agent jobs associated with the CDC-enabled table, the capture and cleanup jobs, will be created like below:

For Change Data Capture (CDC) to operate effectively, it relies on the presence of a running SQL Server Agent within the SQL Server instance. Upon enabling CDC for a database table, the system automatically generates two SQL Server Agent jobs tailored for that specific database.

The first job is responsible for populating the database change tables with pertinent change information retrieved from the transaction log.
The second job is tasked with maintaining the cleanliness of the change tables by routinely purging records older than a predefined retention period, typically set to 3 days.

CDC's functionality hinges on the SQL Server Transaction Log as its primary data source for detecting and capturing changes. Whenever a modification occurs within the database, this alteration is promptly recorded within the Transaction Log file.

Furthermore, it's noteworthy that upon job creation, they are automatically activated, ensuring the seamless operation of the CDC without the need for manual intervention.

When CDC is enabled on any table, it usually captures the data of all the columns. During INSERT or DELETE operations, it is necessary to capture all the data, but in UPDATE operations, only the data of the updated columns are required. CDC is not yet advanced enough to provide this kind of dynamic column selection, but CDC can let you select the columns from which data changes should be captured from the beginning.

Before we start, let us select from a couple of tables and observe their content.
select * from dbo.Persons;
select * from [cdc].[dbo_Persons_CT];
select * from [cdc].[captured_columns];
select * from [cdc].[change_tables];
select * from [cdc].[ddl_history];
select * from [cdc].[index_columns];

Column _$operation of table CDC.captured_columns table contains a value that corresponds to DML Operations. Following is a quick list of values and their corresponding meaning.

Delete Statement= 1
Insert Statement= 2
Value before Update Statement= 3
Value after Update Statement= 4

It is important to understand the Update mask column in the tracking table. It is named _$update_mask. The value displayed in the field is hexadecimal but is stored as binary.

In our example, we have three different operations. INSERT and DELETE operations are done on the complete row, not individual columns. These operations are listed marked masked with 0x1F translated in binary as 0b11111, which means all five table columns.

In our example, we had an UPDATE on only two columns – the second and fifth. This is represented with 0x12 in hexadecimal value ( 0b10010 in binary). Here, this value stands for the second and fifth values if you look at it from the right as a bitmap. This is a valuable way of determining which columns are being updated or changed.

The tracking table shows two columns containing the suffix lsn, i.e., _$start_lsn and _$end_lsn. These two values correspond to the Log Sequential Number. This number is associated with the committed transaction of the DML operation on the tracked table.

Now, let's try to Insert some data into the table and check, what all the data above tables returns.
INSERT INTO [dbo].[Persons]
           ([FirstName],[LastName],[SSN],[InsertDte_sys],[DOB_sys],[WorkStartsAt_sys],[CurrentTime_sys])
     VALUES
           ('test1'
           ,'test2'
           ,'123'
           ,'2024-01-13 18:57:58.5614420'
           ,'2024-01-13'
           ,'18:57:58.5614498'
           ,'2024-01-13 18:57:58.5614499 +05:30');

Once, Data is inserted into dbo. Person table, we can see a row added to that table. As part of CDC enabled on this table, even CDC.dbo_persons_ct table got a row added with the operation. 2 which is for Insert operation and it captured the CDC-enabled columns.

Now, let's try to Update some values in the Person table and check the result in the CDC table.
update [dbo].[Persons]
SET [FirstName] = 'test_test'
where Id = 1;

After updating the value in dbo. person table, the same was reflected in the table. But in the CDC table, 2 rows were added with operations,3,4. So whenever an Update happens, 2 rows will be captured as part of the CDC process.
Value before Update Statement= 3

Value after Update Statement= 4

Now, let's try Delete the value and check the CDC table.
Delete from dbo.Persons where Id = 1;

Once, data is deleted from the table, the CDC table will have a row added with option,1 which is for the delete operation.

Now, let us try to Alter the table and check how the CDC will work.

ALTER TABLE dbo.Persons
ADD Email varchar(255);

Once, the table is altered the CDC.ddl_history table captures the schema update.

Retrieve Captured Data of Specific Time Frame
Often, one is asked for data to be tracked over a time interval. If you look at the tracking data, no time is captured. It always provides all the information. However, a few fields can help us, i.e., _$start_lsn. LSN stands for the Last Sequence Number. An LSN uniquely identifies every record in the transaction log. They are always incrementing numbers.

LSN numbers are always associated with time, and their mapping can be found after querying system table cdc.lsn_time_mapping. This table is one of the tables that was created when the Person database was enabled for CDC. You can run this query to get all the data in the table cdc.lsn_time_mapping.
SELECT * FROM cdc.lsn_time_mapping

When this query is run, it will give us all the rows of the table. Finding the necessary information from all the data is a little difficult. The usual case is when we must inspect a change that occurred in a particular period.

We can find the time that corresponds to the LSN by using the system function sys.fn_cdc_map_time_to_lsn. If we want all the changes done yesterday, we can run this function as described below, and it will return all the rows from yesterday.

Querying the change tables is not recommended by Microsoft. Instead, you can query the CDC.fn_cdc_get_all_changes system function associated with the CDC-enabled table. Before we run this query, let us explore table-valued functions (TVF) in the database. You can see that there are new TVFs created with schema cfc. These functions were created when table-level CDC was enabled.

The function cdc.fn_cdc_get_all_changes_dbo_Persons can be used to get events that occurred over a particular period. You can run this T-SQL script to get events during any specific period.

The following query should retrieve the data modified in the past 24 hours.
USE person
GO
DECLARE @begin_time DATETIME, @end_time DATETIME, @begin_lsn BINARY(10), @end_lsn BINARY(10);
SELECT @begin_time = GETDATE()-1, @end_time = GETDATE();
SELECT @begin_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than', @begin_time);
SELECT @end_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Persons(@begin_lsn,@end_lsn,'all')
GO

We have used relational operations in the function sys.fn_cdc_map_time_to_lsn. There can be a total of four different relational operations available to use in that function:

largest less than
largest less than or equal
smallest greater than
smallest greater than or equal

This way, the captured data can be queried easily and query based on time intervals.

Disabling CDC
Disabling this feature is very simple. As we have seen earlier, if we have to enable CDC, we have to do this in two steps – at the table level and the database level; In the same way, when we have to disable this feature, we can do this at the same two levels. Let us see both of them one after one.
Disabling Change Data Capture on a table

For dropping any table tracking, we need three values: the Source Schema, the Source Table name, and the Capture Instance.

However, we do not know the name of the Source Schema, the Source Table name, and the Capture Instance. We can retrieve it very easily by running the following T-SQL Query.
USE database;
GO

EXEC sys.sp_cdc_help_change_data_capture
GO

This will return a result that contains all the three required information for disabling CDC on a table.

The Change Data Capture can be easily disabled on a specific table using the sys.sp_cdc_disable_table system stored procedure, as shown below.
USE person

EXEC sys.sp_cdc_disable_table
@source_schema = 'dbo',
@source_name = 'Persons',
@capture_instance = 'dbo_Persons'

Once Change Data Capture is disabled on any table, it drops the Change Data Capture table as well as all the functions that were associated with it. It also deletes all the rows and data related to this feature from all the system tables and changes relevant data in catalog views.

The CDC.dbo_persons_ct table has been deleted.

Disabled CDC at the database level

disabled completely at the database level, without the need to disable it on CDC-enabled tables one by one, using the sys.sp_cdc_disable_db system stored procedure, as shown below:
USE person
GO

sys.sp_cdc_disable_db
GO

Once CDC is disabled at the Database level, then all tables, and functions related to CDC will be deleted.

SQL Server Change Data Capture, Pros and Cons
Challenges of SQL Server CDC

While SQL Server Change Data Capture (CDC) offers valuable functionality for tracking data changes, it does come with some limitations. Although it utilizes I/O and incurs a minor overhead on server resources, it doesn't directly impact the tables themselves, nor does it introduce locks or reads that could impede transaction performance. However, in scenarios where tables experience high volumes of DML activity, SQL Server CDC may not be the most efficient solution for real-time data replication.
Setting up CDC can be more complex compared to alternative mechanisms, and ongoing intervention by database administrators may be necessary, particularly for managing transaction logs in transactional replication scenarios.
It's important to note that CDC functionality is not supported in SQL Server Web, Express, and Standard editions. Therefore, organizations exploring alternatives for SQL Server to Snowflake migration or considering Postgres CDC should be aware of these limitations and explore other options accordingly.

Benefits of SQL Server CDC

One of the significant advantages of SQL Server Change Data Capture (CDC) is its flexibility in supporting tables without a primary key. Unlike many other replication mechanisms from SQL Server that necessitate tables to have a primary key, CDC can efficiently handle tables lacking this constraint. This capability proves particularly beneficial for scenarios requiring incremental data capture and real-time replication, even when primary keys are absent from the tables intended for replication, making it an ideal solution for SQL Server to Snowflake migration scenarios.
Moreover, CDC's functionality extends seamlessly to environments utilizing Always On Availability Groups on SQL Server. The CDC data is reliably replicated to the replicas, ensuring consistent data synchronization across the entire environment. This feature further enhances CDC's utility and compatibility within complex SQL Server deployment architectures.

SQL Server Change Data Capture Limitations (Challenges with custom coding)

While the native SQL Server Change Data Capture (CDC) functionality effectively captures data changes, leveraging it for SQL replication to a Data Warehouse or Data Lake typically entails significant coding efforts. This coding involves processing and utilizing the captured changes efficiently, which can add complexity to the replication process, as highlighted in the SQL Server to Snowflake migration process.
Furthermore, beyond ensuring reliability and robustness, ongoing maintenance becomes a critical concern for enterprises utilizing CDC. Issues such as data integrity, handling new tables, and accommodating various data types require continuous attention and management, as observed in the migration journey from SQL Server to Postgres.
In addition to data management challenges, the development of alerting and monitoring solutions also necessitates manual coding efforts. However, hand-coding these solutions may not offer a quick time to value, and they can incur substantial expenses in terms of development and maintenance.
Overall, while SQL Server CDC provides valuable change capture capabilities, enterprises must contend with the complexities of coding, ongoing maintenance, and the development of monitoring solutions, which can potentially hinder the efficiency and cost-effectiveness of their data replication and migration processes.

Increase Maxscans and/or max trans for lower latency
If the workload surpasses the capabilities of CDC with default parameters or if latency becomes unacceptably high, adjusting certain parameters can help mitigate these issues. Consider increasing maxscans and/or maxtrans by a factor of 10 or reducing the polling interval to 1 to alleviate workload constraints. Monitor query performance, workload, and latency for change tables closely during these adjustments. If latency remains a concern despite these changes, further increasing maxtrans may be necessary, but continue to monitor the impact on performance, workload, and latency. Ideally, when SQL Server CDC effectively handles the workload, latency should not significantly exceed the polling interval.

Insert/Delete vs. Update
How many rows does the DML operation create per transaction?
During SQL Server Change Data Capture (CDC), each INSERT operation on a table results in the creation of one row in the change table. However, an UPDATE operation generates two rows in the change table: one representing the data before the update and another for the data after the update. If a row is immediately inserted and then updated, three rows are created in the change table. It's advisable to avoid using CDC for tables prone to frequent large updates or situations where immediate updates follow inserts.

CDC meticulously tracks INSERT, UPDATE, and DELETE operations on the database table, recording comprehensive information about these changes in a corresponding mirrored table with an identical column structure. Additional columns in the change table include.
__$start_lsn and __$end_lsn: Indicating the commit log sequence number (LSN) assigned by the SQL Server Engine to the recorded change.
__$seqval: Reflecting the order of the change relative to others within the same transaction.
__$operation: Identifying the operation type (1 = delete, 2 = insert, 3 = update before, 4 = update after).
__$update_mask: A bit mask specifying the updated columns.

This detailed information facilitates monitoring of database changes for security or auditing purposes or facilitates incremental loading of changes from the OLTP source to the target OLAP data warehouse using T-SQL or ETL methods.

Run the Cleanup job efficiently to remove deleted entries
To initiate the cleanup process for removing deleted entries in SQL Server Change Data Capture (CDC), the stored procedure sp_MScdc_cleanup_job is executed. This procedure retrieves the retention and threshold settings for the cleanup job from the cdc_jobs table in the msdb database.

The configurable threshold value specifies the maximum number of deleted entries that can be removed in a single statement. When dealing with large thresholds, there's a possibility of lock escalation, which could potentially impact application response time and increase CDC job latency. Therefore, it's recommended to schedule the cleanup process during non-peak hours or when the workload is minimal to mitigate any adverse effects on application performance.

By automating CDC cleanup tasks, organizations can effectively manage the maintenance of CDC environments, ensuring data integrity and optimal performance.

Monitor transaction log file storage
SQL Server Change Data Capture (CDC) can be impacted by the transaction log I/O subsystem. Enabling the CDC in a database significantly increases log file I/O, as the log records remain active until they are processed by the CDC. This prolonged activity can lead to latency issues, as the log space cannot be reused until CDC completes its processing. In high-latency environments, the log disk may grow and become full, which halts CDC from processing further transactions, as CDC relies on logged operations to write changes to the change tables.

To address this issue, one solution is to add another log file on a separate disk. By distributing the log files across multiple disks, the I/O workload can be effectively balanced, mitigating potential bottlenecks and reducing latency in CDC processing. This ensures a smoother operation of CDC and prevents log space exhaustion, allowing the system to efficiently manage data changes while maintaining optimal performance.
Differentiate between the application data and the change data

Typically, the location of the change table in the same filegroup as the application table doesn't significantly impact SQL Server Change Data Capture (CDC) performance. However, in scenarios with heavily loaded I/O subsystems, placing change tables in a file group on separate physical disks can enhance CDC performance.

To ensure optimal performance and maintain a small size for the PRIMARY filegroup, it's advisable to specify a different file group from PRIMARY using the @filegroup_name parameter in the sys.sp_cdc_enable_table stored procedure. This approach helps differentiate between application data and change data, facilitating efficient management and organization of CDC-related resources.

Keep the table cdc.lsn_time_mapping on a separate file-group
In addition to changing tables, it's crucial to consider the impact on the cdc.lsn_time_mapping table, which can also grow significantly due to numerous I/O operations. Whenever sys.sp_cdc_enable_db is executed on a database, this table is created in the default filegroup. To mitigate potential issues, it's recommended to ensure that change data capture metadata, particularly the cdc.lsn_time_mapping table, is stored on a file group other than PRIMARY before executing sys.sp_cdc_enable_db. Once the SQL change data capture metadata tables are created, the default file group can be reverted if necessary. This approach helps manage the size and performance of the CDC metadata tables, promoting efficient operation of the CDC environment.

HostForLIFE.eu SQL Server 2022 Hosting
HostForLIFE.eu is European Windows Hosting Provider which focuses on Windows Platform only. We deliver on-demand hosting solutions including Shared hosting, Reseller Hosting, Cloud Hosting, Dedicated Servers, and IT as a Service for companies of all sizes.

Tags: sql server hosting
Categories:
Actions: E-mail | Kick it! | Permalink | comment

Comments (0) | RSS comment feed

Comment RSS

SQL Server Hosting - HostForLIFE :: How to Fetch Daily, Weekly, Monthly, and Yearly Data from SQL?

March 14, 2024 08:39 by author

Peter

Database administration entails retrieving useful information at the appropriate moment. SQL, the database management language, includes some useful tools for this purpose. It lets you to collect data over specific time periods such as days, weeks, months, and years. In this post, we'll look at simple ways to use SQL to retrieve the data you need and enhance your SQL skills.

What are dates and times in SQL?
Before we begin, let's learn a little about dates and timings in SQL. There are several unique utilities (such as 'DATEPART', 'DATEADD', and 'DATEDIFF') that make working with time simpler. Getting comfortable with these tools will allow you to use SQL more successfully. DATEPART, DATEADD, and DATEDIFF are SQL functions that deal with date and time information. Here is a brief explanation for each.

1. DATEPART() Function
The DATEPART() function is used to extract a specific part (such as year, month, day, hour, minute, etc.) from a given date or time value.
Syntax
DATEPART(datepart, date) or DATEPART(datepart, expression)

Example
DATEPART(YEAR, '2024-01-24') returns 2024, extracting the year from the given date.

2. DATEADD() Function
The DATEADD function() is used to add or subtract a specified time interval (such as days, months, years, etc.) to a given date or time value.

Syntax
DATEADD(datepart, number, date) or DATEADD(datepart, number, expression)

Example
DATEADD(MONTH, 3, '2024-01-24')` adds 3 months to the given date, resulting in '2024-04-24'

3. DATEDIFF() Function
The DATEDIFF function() calculates the difference between two date or time values, returning the result in terms of a specified time unit (such as days, months, years, etc.).

Syntax
DATEDIFF(datepart, startdate, enddate) or DATEDIFF(datepart, startexpression, endexpression)

Example
DATEDIFF(DAY, '2024-01-01', '2024-01-24')`

It returns the number of days between the two dates.

How can we fetch data according to our specific requirements?
Let's write the query to understand how we can fetch data according to our specific requirements, be it on a DateTime basis

First, you need to create a table in SQL and insert some data on this so write this query to create and insert the data in a Registration table with column name Id, Name, Email, CreatedDate.
CREATE TABLE Registration (
    Id INT PRIMARY KEY,
    Name VARCHAR(255),
    Email VARCHAR(255),
    CreatedDate DATETIME NOT NULL
);

To insert the data in a Registration table, write this query.
INSERT INTO Registration (Id, Name, Email, CreatedDate)
VALUES
    (1, 'John Doe', '[email protected]', '2024-12-24 10:00:00'),
    (2, 'Jane Smith', '[email protected]', '2023-01-24 10:15:00'),
    (3, 'Alice Johnson', '[email protected]', '2023-12-20 10:30:00'),
    (4, 'Bob Brown', '[email protected]', '2024-01-24 10:45:00'),
    (5, 'Eva Davis', '[email protected]', '2024-11-10 11:00:00'),
    (6, 'Charlie Brown', '[email protected]', '2023-05-24 12:00:00'),
    (7, 'David Miller', '[email protected]', '2023-01-24 12:15:00'),
    (8, 'Fiona Johnson', '[email protected]', '2024-12-24 12:30:00'),
    (9, 'George Taylor', '[email protected]', '2023-01-24 12:45:00'),
    (10, 'Helen Carter', '[email protected]', '2024-01-12 13:00:00'),
    (11, 'Ian Williams', '[email protected]', '2022-05-24 13:15:00'),
    (12, 'Jack Davis', '[email protected]', '2024-01-24 13:30:00'),
    (13, 'Karen White', '[email protected]', '2024-01-24 13:45:00'),
    (14, 'Liam Robinson', '[email protected]', '2023-01-24 14:00:00'),
    (15, 'Mia Harris', '[email protected]', '2024-06-24 14:15:00'),
    (16, 'Noah Martin', '[email protected]', '2023-12-24 14:30:00'),
    (17, 'Olivia Jackson', '[email protected]', '2023-10-24 14:45:00'),
    (18, 'Peter Thomas', '[email protected]', '2024-01-24 15:00:00'),
    (19, 'Quinn Clark', '[email protected]', '2024-08-10 15:15:00'),
    (20, 'Ryan Lee', '[email protected]', '2023-12-10 15:30:00');

Instead of manually specifying the date, you can use the GETDATE() function to retrieve the current datetime. You can write like this.
INSERT INTO Registration (id, name, email, createddate)
VALUES
    (1, 'John Doe', '[email protected]', GETDATE()),
    (2, 'Jane Smith', '[email protected]', GETDATE()),
--you can insert all 20 values like this using GETDATE()function--

Fetching Daywise Data
To retrieve data daily, we can utilize the `WHERE` clause with appropriate date conditions. Learn how to structure queries to fetch data for a specific day or a range of days. Examples will demonstrate how to filter records based on exact dates or using relative date conditions.
DECLARE @Date DATETIME = '2024-01-24 13:43:35.007';
SELECT * FROM Registration WHERE CAST(CreatedDate AS DATE) = CAST(@Date AS DATE);

Retrieving Weekly Data
Understanding how to retrieve data weekly. In this query, I am fetching data from the previous week, covering the period from last Monday to Sunday.

DECLARE @Date DATETIME = '2024-01-24 13:43:35.007';
SELECT * FROM Registration
WHERE CAST(CreatedDate AS DATE) BETWEEN DATEADD(DAY, -7, DATEADD(WEEK, DATEDIFF(WEEK, 0, @Date), 0)) AND DATEADD(DAY, 6, DATEADD(DAY, -7, DATEADD(WEEK, DATEDIFF(WEEK, 0, @Date), 0)))

Fetching Monthly Data
To retrieve data monthly, this query fetches data from the previous month, covering the period from the 1st day of the last month to the 30th or 31st day.

DECLARE @Date DATETIME = '2024-01-18 13:43:35.007';
SELECT * FROM Registration
WHERE CAST(CreatedDate AS DATE) BETWEEN DATEADD(MONTH, DATEDIFF(MONTH, 0, @Date) - 1, 0) AND DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, @Date), 0))

Retrieving Yearly Data
To retrieve data yearly, this query fetches data from the previous year, covering the period from the first month and 1st day of the last year to the last month and 30th or 31st day of the year.
DECLARE @Date DATETIME = '2024-01-22 13:43:35.007';
SELECT * FROM Registration
WHERE CAST(CreatedDate AS DATE) BETWEEN DATEFROMPARTS(YEAR(@Date) - 1, 1, 1) AND DATEFROMPARTS(YEAR(@Date), 1, 1)

Conclusion
In This article, we learned about SQL's date and time functions open up powerful capabilities for fetching data with precision and efficiency. By understanding how to manipulate dates and times, you can tailor queries to extract relevant information for specific periods, enhancing your database management skills significantly.

FAQ's
Q 1. What is the purpose of the DATEPART() function in SQL?
Ans. The DATEPART() function is used to extract specific components, such as year, month, day, hour, minute, etc., from a given date or time value in SQL.

Q 2. How does the DATEADD() function work in SQL?
Ans. The DATEADD() function adds or subtracts a specified time interval (days, months, years, etc.) to a given date or time value, allowing for easy manipulation of dates.

Q 3. What does the DATEDIFF() function do in SQL?
Ans. The DATEDIFF() function calculates the difference between two date or time values and returns the result in terms of a specified time unit (days, months, years, etc.).

Q 4. How can I retrieve data for a specific day using SQL?
Ans. To fetch data for a specific day, you can use the WHERE clause with conditions comparing the date portion of the datetime column to the desired date.
DECLARE @Date DATETIME = '2024-01-24 13:43:35.007';
SELECT * FROM Registration WHERE CAST(CreatedDate AS DATE) = CAST(@Date AS DATE);

Q 5. How can I retrieve weekly data in SQL?
Ans. You can retrieve weekly data by specifying a date range covering the desired week using the DATEADD() and DATEDIFF() functions within the WHERE clause.
DECLARE @Date DATETIME = '2024-01-24 13:43:35.007';
SELECT * FROM Registration
WHERE CAST(CreatedDate AS DATE) BETWEEN DATEADD(DAY, -7, DATEADD(WEEK, DATEDIFF(WEEK, 0, @Date), 0)) AND DATEADD(DAY, 6, DATEADD(DAY, -7, DATEADD(WEEK, DATEDIFF(WEEK, 0, @Date), 0)))

Q 6. How do I fetch monthly data in SQL?
Ans. Monthly data can be retrieved by setting a date range spanning the desired month using the DATEADD() and DATEDIFF() functions to calculate the start and end dates.

DECLARE @Date DATETIME = '2024-01-18 13:43:35.007';
SELECT * FROM Registration
WHERE CAST(CreatedDate AS DATE) BETWEEN DATEADD(MONTH, DATEDIFF(MONTH, 0, @Date) - 1, 0) AND DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, @Date), 0))

Tags: sql server hosting
Categories:
Actions: E-mail | Kick it! | Permalink | comment

Comments (0) | RSS comment feed

Comment RSS

SQL Server Hosting - HostForLIFE :: How to Renaming the Column in SQL Server ?

March 7, 2024 06:05 by author

Peter

I recently worked on a project to assess the schema of a third-party vendor. Our organization has an internal support ticket tracking tool. The program used a SQL database, and after calculating its cost, we opted not to extend the contract. The objective was to create an in-house platform for managing internal support tickets.

My responsibility was to review the schema of the internal support database. We couldn't figure out what data was in which table because the structure was intricate and the table names were tough. Eventually, I was able to determine the relationship between tables and what data was contained in each. I've also given the columns proper names so that we can quickly identify what data is contained in which column. I used the sp_rename method to rename tables.

This article explains how to rename a column using the sp_rename stored procedure. I also demonstrated how to rename a column in SQL Server Management Studio. First, let us look at the fundamentals of renaming a column.

The Basics of Renaming Columns
Renaming a table's column is a simple task. We can use a system-stored process called sp_rename. Additionally, we can utilize SQL Server Management Studio to rename a column. The sp_rename stored procedure can rename the following:

Database objects like tables, Stored procedures, and functions.
Indexes and statistics
User-defined datatypes.

In this article, we will learn how to rename any column of an SQL Server table using the sp_rename stored procedure.

How can you rename a column in SQL Server?
In SQL Server, we may rename any column or object using the sp_rename stored procedure. In this post, we'll look at how to rename columns using the sp_rename function.

The syntax for the sp_rename stored procedure is as follows.

Exec sp_rename 'original_schema_name.original_table_name.original_column_name', 'new_column_name' ,'object_type'

In the syntax

original_schema_name.original_table_name.original_column_name: Specify the table name whose column you want to rename. If you are renaming a column of a table that exists in the user-defined schema, you must specify the table name in three three-part names.
new_column_name: Specify the new name of the column.
object_type: Specify the object type.

Let us understand the process with simple examples. Suppose you want to rename a column of the patient table. The original column name is Address, and we want to change it to patient_address. The sp_rename command to rename the column is as follows.

USE HospitalManagementSystem
GO
EXEC sys.sp_rename 'patients.address','patient_address','COLUMN'

Once the column is renamed, let us verify that the column has been renamed successfully. You can run the below query to view the columns of the patient table.
use HospitalManagementSystem
go
select table_name,column_name from information_schema.columns where table_name='Patients'

Output

As you can see in the above image, the column Address has been changed to patient_address.

Let us take another example. Suppose you want to rename the column of Sales. invoice table which exists in the Wideworldimportors database. The current name of the column is InvoiceDate, and the new name will be InvoiceCreateDate. The query to rename the column is the following.
EXEC sys.sp_rename 'Sales.Invoices.InvoiceDate','InvoiceCreateDate','COLUMN'

Here you can see, that we are changing the column name of the invoice table which is in the Sales schema. Therefore, we have used three-part naming. Once the column is renamed, execute the following T-SQL query to verify that the column has been renamed.
select table_name,column_name from information_schema.columns where table_name='Invoices'

Output

Renaming other objects in SQL Server
The sp_rename stored procedure can be used to rename other database objects, such as indexes, constraints, and stored procedures. The syntax of the sp_rename operation stays unchanged. The object_type argument for the sp_rename column will change. Let us consider a basic example.

Assume we wish to rename the index of the sales invoice table. The index's present name is 'IX_Sales_Invoices_ConfirmedDeliveryTime', which we would like to modify to 'IX_Sales_Invoices_ConfirmedDeliveryTime_New'. In the query, the object_type argument in the sp_rename method will be set to INDEX. The query to rename the index is shown below.

EXEC sys.sp_rename 'Sales.Invoices.IX_Sales_Invoices_ConfirmedDeliveryTime','IX_Sales_Invoices_ConfirmedDeliveryTime_New','INDEX'

Once the index is renamed, you can query sys. indexes dynamic management view to verify that the index has been renamed successfully. Note that whenever we rename any index, the statistics associated with the index will be renamed as well. Here is the query to verify both changes.
SELECT object_name(object_id)[TableName], name [IndexName], Type_desc [Index Type]
FROM sys.indexes where object_id=object_id('Sales.Invoices')

Output

Using SQL Server Management Studio to Rename
We can use SQL Server Management Studio to rename the database object. In the first section, we learnt how to rename columns and indexes using the sp_rename stored procedure.

In this example, we'll see how to rename a constraint in SQL Server Management Studio. For demonstration, I'll rename the constraint in the Sales.invoice table. The present constraint name is DF_Sales_Invoices_InvoiceID, which we will rename to Default_Sales_Invoices_InvoiceID. As the name implies, this constraint is a default constraint.

First, launch SQL Server Management Studio and connect to your database server. Expand databases. Expand the Wideworldimportors database.

A database contains many tables. Expand the Sales, Invoice, and Constraint tables. Press F2 or right-click on DF_Sales_Invoices_InvoiceID and choose Rename.

The name will be editable. Change the name to Default_Sales_Invoices_InvoiceID and hit enter. The name will be changed. The SQL Server management studio prompts a confirmation message that looks like the following image.

Click OK to change the name. Once changes are made, execute the following T-SQL query to verify that the constraint has been renamed successfully.
SELECT
[constraint].name AS constraint_name,
OBJECT_NAME([constraint].parent_object_id) AS table_name,
[column].name AS column_name from
sys.default_constraints [constraint]
JOIN
sys.columns [column] ON [constraint].parent_object_id = [column].object_id
AND [constraint].parent_column_id = [column].column_id
where OBJECT_NAME([constraint].parent_object_id)='Invoices'

Output

Let us take a look at some limitations and things to be considered before renaming any column.

Limitations and Considerations
If you are renaming any column in a table or renaming any object in a database, you must consider the following limitations and possible issues that might break the application.

ALTER permission is needed on the object that you want to rename. Suppose you want to rename a column name; you must have ALTER object permission on the table whose column you are renaming.
Renaming a column name always breaks the stored procedure or other objects (View, function, etc.) that are referencing that column. For example, you are renaming a column that is being used in a view. Therefore, make sure you modify all the stored procedures, functions, and triggers that reference the column that was renamed. You can use sys.sql_expression_dependencies to find all dependencies of the column.
When you rename a stored procedure, the object's name in sys.sql_modules will not change. Hence Microsoft recommends dropping and recreating an object instead of just renaming it.
When you rename a column of a table that is part of replication, the replication might break so if we want to rename the column of the replicated table, first, we must pause the replication, then rename the column using sp_rename or SQL Server management studio, update all database objects that are referencing the column, and finally, reinitialize replication with the new snapshot.

Conclusion
In this tutorial, we learned how to rename any column in a table. I demonstrated how to rename a column using a system-stored process called sp_rename, complete with syntax and easy examples. We also learned how to rename a column using SQL Server Management Studio. We can also use other tools, such as dbForge Studio for SQL Server, to run the stored procedure to rename a column. We also reviewed the limits and other difficulties that must be addressed before to renaming any object or column.

Comments (0) | RSS comment feed

Comment RSS

SQL Server Hosting - HostForLIFE :: Effective Strategies with Practical Examples

March 4, 2024 08:25 by author

Peter

SQL tuning is basic for making strides database execution, guaranteeing proficient inquiry execution, and expanding framework responsiveness. Utilizing straightforward but proficient strategies, engineers and database chairmen can incredibly improve SQL inquiry execution. In this post, we'll see at a few principal SQL tuning procedures and give viable cases to illustrate their effectiveness.

1. Use proper indexing
Indexes are required for efficient data retrieval in SQL queries. Create indexes on columns that are often used in WHERE, JOIN, and ORDER BY clauses to increase query efficiency. Let's take an example.

-- Create an index on the 'name' column of the 'users' table
CREATE INDEX idx_name ON users(name);

-- Query with indexed column 'name'
SELECT * FROM users WHERE name = 'John';

In this example, creating an index on the 'name' column of the 'users' table improves the performance of the query that filters records based on the user's name.

2. Optimize Query Structure
Well-structured SQL queries can improve performance by minimizing unnecessary processing and data retrieval. Avoid using wildcard characters excessively and optimize complex queries. Consider the following example.

-- Inefficient query with unnecessary functions and subquery
SELECT AVG(salary) FROM employees WHERE department_id IN (SELECT id FROM departments WHERE name = 'Sales');

-- Optimized query using JOIN
SELECT AVG(e.salary)
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE d.name = 'Sales';

In this example, replacing the subquery with a JOIN operation improves query readability and performance.

3. Avoid Full Table Scans
Full table scans can degrade performance, especially on large tables. Utilize indexes and WHERE clauses to limit the number of rows scanned. Consider the following example.
-- Inefficient query with full table scan
SELECT * FROM orders WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- Optimized query using an index and WHERE clause
CREATE INDEX idx_order_date ON orders(order_date);
SELECT * FROM orders WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

By creating an index on the 'order_date' column and using a WHERE clause, we can avoid a full table scan and improve query performance.

4. Limit Result Sets
Retrieve only the necessary data to minimize network overhead and improve query response time. Consider the following example.

-- Fetching all columns unnecessarily
SELECT * FROM products;

-- Fetching specific columns
SELECT product_id, product_name FROM products;

Limiting the columns retrieved reduces data transfer and improves query performance, especially when dealing with large tables.

5. Screen and Analyze Execution
Screen database execution estimations on a standard premise and look at inquiry execution plans to discover bottlenecks and regions for optimization. Utilize database checking devices and execution dashboards to screen inquiry execution over time.

SQL tuning may be a crucial component of database optimization, permitting undertakings to progress execution and versatility. Utilizing straightforward tuning methods such as ordering, inquiry optimization, and result set administration, engineers and chairmen can altogether make strides SQL inquiry execution. Persistent checking and investigation of database execution are required to reveal advancement openings and guarantee effective database operation. Organizations can maximize the execution of their database frameworks by taking a proactive approach to SQL optimization.

Tags: sql server hosting
Categories:
Actions: E-mail | Kick it! | Permalink | comment

Comments (0) | RSS comment feed

Comment RSS

European Windows 2012 Hosting BLOG

SQL Server Hosting - HostForLIFE :: Change Data Capture - CDC with SQL server

SQL Server Hosting - HostForLIFE :: How to Fetch Daily, Weekly, Monthly, and Yearly Data from SQL?

SQL Server Hosting - HostForLIFE :: How to Renaming the Column in SQL Server ?

SQL Server Hosting - HostForLIFE :: Effective Strategies with Practical Examples

About HostForLIFE.eu

Other Important BLOGs

Month List

Featured on

European Windows 2012 Hosting BLOG

SQL Server Hosting - HostForLIFE :: Change Data Capture - CDC with SQL server

SQL Server Hosting - HostForLIFE :: How to Fetch Daily, Weekly, Monthly, and Yearly Data from SQL?

SQL Server Hosting - HostForLIFE :: How to Renaming the Column in SQL Server ?

SQL Server Hosting - HostForLIFE :: Effective Strategies with Practical Examples

About HostForLIFE.eu

Other Important BLOGs

Month List

Tag cloud

Featured on