When inserting string into SQL Server table using the INSERT predicate that uses string concatenation there seems to be an issue with in SQL Server 2005. There seems to be an upper limit of about 480+ concaenations before the insert fails and you get following error messsge:
Msg 191, Level 15, State 1, Line 1
Some part of your SQL statement is nested too deeply. Rewrite the query or break it up into smaller queries.
What does that mean? I mean it was a simple insert statement no loops just two brackets; well it seems to be a bug and has been fixed in next major release of SQL Server.
To produce the error, create a new database and try executing the following SQL Statement:
... to create test table ...CREATE TABLE [dbo].[T1](
[ID] [int] IDENTITY(1,1) NOT NULL,
[TValue] [varchar](MAX) NULL,
CONSTRAINT [PK_T1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
... test insert statements ...-- Good SQL StatementPlease click here for full statement.
INSERT INTO T1 VALUES ('A' + ... + 'A')
GO
-- Bad SQL Statement
INSERT INTO T1 VALUES ('A' + ... + 'A' + 'EXTRA')
GO
Only difference between the good statement and bad statement? The bad statement has ONE extra concatenation.
Microsoft Connect Article, Link.
Wednesday, October 21, 2009
Error 191: Some part of your SQL statement is nested too deeply.
Why backup testing is important?
How do you do backup testing? In normal cases we backup data from SQL Server (either using SQL Server Agent, or another 3rd party backup agent) to Disk and then to Tape or Directly to tape. We setup alerts on backups to make sure the backup job completed successfully or failed? What else we can do, we can implement verification test after backups to make sure backups are valid. Is that really enough?
What other things should be considered? Every part of the backup and restore process needs to be tested to confirm you have good backups. It should be part of policy to test this as often as possible with minimum at quarterly.
** Assuming using SQL Server Agent Backups **.
1) Confirm backups (Email alerts, manual checks).
2) Confirm backups are going to Tape directly or are being picked up from Operating System backup (Email alerts, manual checks).
3) Request a restore from Tape (make sure it is from off-site location and not Virtual Tape Library (VTL)).
4) Run Verification on the backup to make sure no issues with tape drive and backup file is still valid).
5) Restore the database in a testing environment.
6) Check security and GET END user to try it out.
7) Verify ALL infrastructure items are there, configured, and working (tape backup, tape agent, sql backup, security, disk space, disk availability, etc.)
So Step 1-2 are normal; but that is not where backup confirmation ends. You should occasionally request restore, verify it and restore it in testing environment. Step 6 is seem a bit strange but it just as much part of restore process, I restore the database I am done but if no one can still connect? Then have I really recovered from disaster? Step 7 is explained later....
So why am I writing about this? This is basic DBA duties that should be hard-coded into our brains. Well I thought it was hard coded in my brain but sometimes because of other project constraints, time, money, etc. Process is skipped well in one such case this happened to me.
I configured SQL Server, all standard jobs; disk backups and email alerts. Everything works nice I did not get any errors. Backups were going off server that has been production for a while setup before my time on the team. So I assume everything was correct as we have been using this dump location for a while. This was new server so I didn't get chance to run verification steps listed above after the server was up, assuming things are going well, I got busy on other tasks. Well that was bad decision on my part, because I ASSUMED! things were configured when they were not. This is were #7 comes into play, I assumed and was left in an awe when I was told this dump location doesn't have any tape agent installed on it.
I got contacted by client, because of a bug in the application it allowed the client to delete their ENTIRE WEBSITE (Content Managed System) with SQL Server Backend. I found out 1 day after the fact so I did not have backups on disk any more. I contact the Storage guys to ask for restore and found out they DON'T HAVE ANY tape backups for my dump location. So I quickly made copy of what Transaction Log Backups, Data backups I hand on hand to see if I can get some kind of restores. But found the transaction log backups needed the full backup that has been over written already.
Since this was a complete data lose I had to turn to 3rd party vendor tools for data-recovery in non-standard manner. I used the ApexSQL Log tool to read the transaction log to recovery engineer all the records that were deleted. However using this tool found few annoyances that you'll have to watch out for:
1) The tool will no generate any recovery script for binary type objects. In file format or other wise.
2) If you have large varchar columns or such the system will try to be creative and change the Enter/Carriage Return to Char(10) + Char(13) string concatenation. Problem with that is for very large varchar columns SQL Server could not prase the SQL String successfully any more, for more please read here.
3) In default settings the code does not generate the GUIDS, so if your database heavily relies on this this setting has be turned off before reverse engineering.
4) While the tool is reading transaction log it chugs through alot of memory, for example reverse engineering, transaction logs with over 1M transactions with 1.5GB ram program crashed on me every time. So I processed the transaction logs in small chucks.
5) I found the tool to be extremely slow when selecting/unselecting and the filter functionality to be limiting.
So aside from these limitations this tool still saved me, it couldn't recovery enough to get site working. But it recovered enough that CMS product vendor was able to recreate the pages because of the redundant nature of the data.
Thursday, September 17, 2009
Unique Stop List for Single Database in SQL 2005
Unlike SQL 2008, in SQL 2005 we cannot define a unique stop list (Noise Word List) for each database. As this is controlled by the Language settings on Full Text Index defined in the catalog.
So lets say I want the stop list to only include words from following list:
a, an, the, I, am, us
Rest of the words must be indexed as per the requirements. Setting up a new instance for this one database was out of scope for budget and modifying the noise world list on SQL Server affected more then a single database. I was able to meet the requirements by manually rebuilding the index with new stop list then changing the list back. But I did not want to do this every month manually; so I decided to create SQL Server Job for this. Before job could be created following must be done:
- Confirm the time Job should run, it cannot overlap with any other auto-populations for full text index.
- The full text catalog in question cannot have auto-population schedule.
- You will have to enable xp_cmdshell Extended stored procedure on the server.
- Create two new text files, one called SQL_noiseNEU.txt which is copy of noiseNEU.txt in MSSQL.1\MSSQL\FTData\ and another text file called Simple_noiseNEU.txt with my stop list.
Step 1 (Executed at 6AM):
Replace the noiseNEU.txt in MSSQL.1\MSSQL\FTData\noiseNEU.txt with Simple_noiseNEU.txt.
Step 2 (Executed at 6:15AM):
Rebuild the catalog for the database. Please note when you change the noise word list you have to rebuild the entire catalog you cannot just rebuild one index.
Step 3 (Executed at 6:30AM):
Replace the noiseNEU.txt in MSSQL.1\MSSQL\FTData\noiseNEU.txt with SQL_noiseNEU.txt.
SQL Job Script Generated from SQL Server: Link.
SQL Server was unable to log on as DOMAIN\ServiceAccount
One of our virtual server decided to reboot last night the OS was able to recover without issues. But for some reason SQL Server refused to work; I looked to event viwer, I see following error message logged in Security Event Log:
Event Type: Error
Event Source: Service Control Manager
Event Category: None
Event ID: 7000Date: 9/17/2009
Time: 10:06:56 AM
User: N/A
Computer: ServerName
Description: The SQL Server (MSSQLSERVER) service failed to start due to the following error: The service did not start due to a logon failure.
So my first suspision was password was wrong, but Service has been running for a while successfully so what caused it to stop working? In any case we tried re-applying security credintials using SQL Server Configuration Manager to no avail. So digging through the Event Logs again I ran by this message:
Event Type: Error
Event Source: Service Control Manager
Event Category: None
Event ID: 7041
Date: 9/17/2009
Time: 10:06:56 AM
User: N/A
Computer: ServerName
Description:The MSSQLSERVER service was unable to log on as DOMAIN\ServiceAccount with the currently configured password due to the following error:
Logon failure: the user has not been granted the requested logon type at this computer.
Service: MSSQLSERVER
Domain and account: DOMAIN\ServiceAccount
This service account does not have the necessary user right "Log on as a service." User Action Assign "Log on as a service" to the service account on this computer.
I was suprised to see the eroror because service account has been running as service for a while; I am not sure how the account got removed from "Log on as service" group policies. But I launched gpedit.msc and went to ...
Computer Configuration -> Windows Settings -> Security Settings -> Local Policies -> User Rights Assignment -> Log on as service
I tried to add the service account but option was disabled; talking to Windows Services they cannot explain why it is disabled.
With SQL Server 2005 everytime we are configuring services it is recommend to use "SQL Server Configuration Manager" because it does various system security configuration for registery and file system that will be missed if using the regular Service Manager, but it seems when setting service account in this utility it does not add the account to the group policy. I used regular Service Manager console (services.msc) to reapply the service account.
This time I get a message "Account has been added to Log On as Service".
Restarted SQL Services, things working successfully now.
Problem: SQL Server 2005 failed to start because service account does not have access to Log On As service in group policies.
Solution: Apply the security settings using the Service Management console (services.msc); using the SQL Server Configuration Manager does not seem to add the account to group policies.
Sunday, July 12, 2009
Windows 2003 Cluster: SQL Node Not able to Join Cluster after Changing MaxTokenSize
As per Cannot Generate SSPI Context article, Link, I had changed the MaxTokenSize to fix the issue. I had read various articles before jumping into this change as being the only proper solution to the problem.
However after changing the setting and rebooting the passive node the node refused to join the cluster; looking at the error logs I have messages like …
- Unable to get join version data from sponsor xxx.xxx.xxx.xxx using NTLM package, status 5.
- Unable to connect to any sponsor node.
- Failed to join cluster, status 53.
- Physical Disk
: [DiskArb] Failed to read (sector 12), error 170.
I started reading through the log file and started to pick through error messages and looked and articles [2], [3], [5], [6], and [7]. I verified all the settings each article suggested, Group Policies, Security, firewall, etc. None of it seems to help in making sure the passive node would join the cluster successfully.
I started trouble ticket with Microsoft; they found few settings, on of being NTML Compatibility Level that needed changing as part of their troubleshooting but even after changing these settings we were still getting NTLM, state 5 error messages in Cluster.log file (State 5 means permissions denied). Talking to Microsoft they referenced few more KB articles in addition to what I had found already like [8], [9], [10], & [11] that indicated what “might” be the issue; but none of them seem to help resolve our issue.
This whole time we had not rebooted the active node as it was working successfully; but since we were hitting stone-wall every turn we decided to further troubleshoot the issue node 1 (active node) must be restarted because the errors that were being generated on Quorum disk. After rebooting Active Node, the Passive Node came active and Clustering was working successfully.
I had not read any KB article indicating the issue with MaxTokenSize and Windows Clustering, and neither had the Microsoft guys. So talking to the Kerberos experts we figured issue was similar to [7], in which if you change password or the password length is less then 15 characters of the Cluster Services account permissions or security settings are not properly hashed and generates errors when authenticating the new node to the cluster.
So if you are changing the MaxTokenSize setting on SQL Server and it is a cluster please make sure you change it on EVERY NODE; or you will have lots of strange issues that probably shouldn’t exist.
Reference Links:
[1] KB886717 Issue with Cluster Log file, Link.
[2] Problems with Microsoft Clusters, Link.
[3] How to manually re-create the Cluster service account, Link.
[4] A Windows Server 2003 based-computer that is running the Cluster service may be unable to join a cluster after the computer is first restarted, Link.
[5] Cluster Service May Not Start After You Restrict Available IP Ports for Remote Procedure Call, Link.
[6] Ask Core!, Troubleshooting Cluster Logs 101 Why did the resource failover to other node?, Link.
[7] Cluster service account password must be set to 15 or more characters if the NoLMHash policy is enabled, Link.
[8] You cannot add an additional node to a Windows Server 2003-based server cluster, and error code "0x8007042b" is logged in the ClCfgSrv.log file, Link.
[9] You receive an "Error 0x8007042b" error message when you add or join a node to a cluster if you use NTLM version 2 in Windows Server 2003, Link.
[10] How to enable NTLM 2 authentication, Link.
[11] Cluster service does not start on joining node in Windows 2000 Cluster, Link.
About Me
- Mohit K. Gupta
- I have a B.Sc. Computer Science with Minor in Japanese. I am currently working on my Masters in Information Systems. I spend most of my time on studies for research or for SQL Server related issues. My main interests revolve around Japan, to that regards, I practice Kendo (The Way of the Sword) and have been doing it for now about 5 Years. In addition I like to play Go (or Igo in Japanese) board game like chess, but maybe a bit more complicated. I am hoping to go to Japan in the future to further develop my Japanese, Go and Kendo skills.