모범 사례가없는 경우 SQL Server 삽입
나는이 Competitions
보유하고 결과 테이블 팀 구성원의 이름 과 한 손에 순위를.
반면에 나는 유일한 경쟁자 이름 의 테이블 을 유지해야 합니다 .
CREATE TABLE Competitors (cName nvarchar(64) primary key)
이제 첫 번째 테이블에 약 200,000 개의 결과가 있으며 경쟁자 테이블이 비어 있으면 이것을 수행 할 수 있습니다.
INSERT INTO Competitors SELECT DISTINCT Name FROM CompResults
그리고 쿼리는 약 11,000 개의 이름을 삽입하는 데 5 초 정도 걸립니다.
지금까지 이것은 중요한 응용 프로그램이 아니므 로 약 10,000 행으로 새로운 경쟁 결과를받을 때 한 달에 한 번 경쟁사 테이블을 자르는 것을 고려할 수 있습니다 .
그러나 새로운 경쟁 업체와 기존 경쟁 업체에서 새로운 결과를 추가 할 때 가장 좋은 방법은 무엇입니까? 기존 경쟁사 테이블을 자르고 싶지 않습니다.
새로운 경쟁사에 대해서만 INSERT 문을 수행해야하며 존재하는 경우 아무 것도하지 않아도됩니다.
의미 적으로 "아직 존재하지 않는 경쟁 업체 삽입"을 요청합니다.
INSERT Competitors (cName)
SELECT DISTINCT Name
FROM CompResults cr
WHERE
NOT EXISTS (SELECT * FROM Competitors c
WHERE cr.Name = c.cName)
또 다른 옵션은 결과 테이블을 기존 경쟁자 테이블과 결합하고 조인과 일치하지 않는 고유 한 레코드를 필터링하여 새 경쟁자를 찾는 것입니다.
INSERT Competitors (cName)
SELECT DISTINCT cr.Name
FROM CompResults cr left join
Competitors c on cr.Name = c.cName
where c.cName is null
새로운 구문 MERGE 는 또한 작고 우아하며 효율적인 방법을 제공합니다.
MERGE INTO Competitors AS Target
USING (SELECT DISTINCT Name FROM CompResults) AS Source ON Target.Name = Source.Name
WHEN NOT MATCHED THEN
INSERT (Name) VALUES (Source.Name);
왜 다른 사람이 아직이 말을하지 않았는지 모릅니다.
정상화.
경쟁을 모델로 한 테이블이 있습니까? 경쟁은 경쟁자로 구성되어 있습니까? 하나 이상의 대회에서 다른 선수 목록이 필요합니다 ......
다음 테이블이 있어야합니다 .....
CREATE TABLE Competitor (
[CompetitorID] INT IDENTITY(1,1) PRIMARY KEY
, [CompetitorName] NVARCHAR(255)
)
CREATE TABLE Competition (
[CompetitionID] INT IDENTITY(1,1) PRIMARY KEY
, [CompetitionName] NVARCHAR(255)
)
CREATE TABLE CompetitionCompetitors (
[CompetitionID] INT
, [CompetitorID] INT
, [Score] INT
, PRIMARY KEY (
[CompetitionID]
, [CompetitorID]
)
)
다른 테이블을 가리키는 CompetitionCompetitors.CompetitionID 및 CompetitorID에 대한 제한 조건이 있습니다.
이러한 종류의 테이블 구조 (키는 모두 간단한 INTS입니다)-모델에 맞는 좋은 자연 키가없는 것처럼 보이므로 SURROGATE 키가 여기에 적합하다고 생각합니다.
따라서 특정 경쟁에서 다른 경쟁 업체 목록을 얻으려면 다음과 같이 쿼리를 발행하십시오.
DECLARE @CompetitionName VARCHAR(50) SET @CompetitionName = 'London Marathon'
SELECT
p.[CompetitorName] AS [CompetitorName]
FROM
Competitor AS p
WHERE
EXISTS (
SELECT 1
FROM
CompetitionCompetitor AS cc
JOIN Competition AS c ON c.[ID] = cc.[CompetitionID]
WHERE
cc.[CompetitorID] = p.[CompetitorID]
AND cc.[CompetitionName] = @CompetitionNAme
)
그리고 당신은 각 경쟁에 대한 점수를 원한다면 경쟁자가 있습니다 :
SELECT
p.[CompetitorName]
, c.[CompetitionName]
, cc.[Score]
FROM
Competitor AS p
JOIN CompetitionCompetitor AS cc ON cc.[CompetitorID] = p.[CompetitorID]
JOIN Competition AS c ON c.[ID] = cc.[CompetitionID]
And when you have a new competition with new competitors then you simply check which ones already exist in the Competitors table. If they already exist then you don't insert into Competitor for those Competitors and do insert for the new ones.
Then you insert the new Competition in Competition and finally you just make all the links in CompetitionCompetitors.
You will need to join the tables together and get a list of unique competitors that don't already exist in Competitors
.
This will insert unique records.
INSERT Competitors (cName)
SELECT DISTINCT Name
FROM CompResults cr LEFT JOIN Competitors c ON cr.Name = c.cName
WHERE c.Name IS NULL
There may come a time when this insert needs to be done quickly without being able to wait for the selection of unique names. In that case, you could insert the unique names into a temporary table, and then use that temporary table to insert into your real table. This works well because all the processing happens at the time you are inserting into a temporary table, so it doesn't affect your real table. Then when you have all the processing finished, you do a quick insert into the real table. I might even wrap the last part, where you insert into the real table, inside a transaction.
Normalizing your operational tables as suggested by Transact Charlie, is a good idea, and will save many headaches and problems over time - but there are such things as interface tables, which support integration with external systems, and reporting tables, which support things like analytical processing; and those types of tables should not necessarily be normalized - in fact, very often it is much, much more convenient and performant for them to not be.
In this case, I think Transact Charlie's proposal for your operational tables is a good one.
But I would add an index (not necessarily unique) to CompetitorName in the Competitors table to support efficient joins on CompetitorName for the purposes of integration (loading of data from external sources), and I would put an interface table into the mix: CompetitionResults.
CompetitionResults should contain whatever data your competition results have in it. The point of an interface table like this one is to make it as quick and easy as possible to truncate and reload it from an Excel sheet or a CSV file, or whatever form you have that data in.
That interface table should not be considered part of the normalized set of operational tables. Then you can join with CompetitionResults as suggested by Richard, to insert records into Competitors that don't already exist, and update the ones that do (for example if you actually have more information about competitors, like their phone number or email address).
One thing I would note - in reality, Competitor Name, it seems to me, is very unlikely to be unique in your data. In 200,000 competitors, you may very well have 2 or more David Smiths, for example. So I would recommend that you collect more information from competitors, such as their phone number or an email address, or something which is more likely to be unique.
Your operational table, Competitors, should just have one column for each data item that contributes to a composite natural key; for example it should have one column for a primary email address. But the interface table should have a slot for old and new values for a primary email address, so that the old value can be use to look up the record in Competitors and update that part of it to the new value.
So CompetitionResults should have some "old" and "new" fields - oldEmail, newEmail, oldPhone, newPhone, etc. That way you can form a composite key, in Competitors, from CompetitorName, Email, and Phone.
Then when you have some competition results, you can truncate and reload your CompetitionResults table from your excel sheet or whatever you have, and run a single, efficient insert to insert all the new competitors into the Competitors table, and single, efficient update to update all the information about the existing competitors from the CompetitionResults. And you can do a single insert to insert new rows into the CompetitionCompetitors table. These things can be done in a ProcessCompetitionResults stored procedure, which could be executed after loading the CompetitionResults table.
That's a sort of rudimentary description of what I've seen done over and over in the real world with Oracle Applications, SAP, PeopleSoft, and a laundry list of other enterprise software suites.
One last comment I'd make is one I've made before on SO: If you create a foreign key that insures that a Competitor exists in the Competitors table before you can add a row with that Competitor in it to CompetitionCompetitors, make sure that foreign key is set to cascade updates and deletes. That way if you need to delete a competitor, you can do it and all the rows associated with that competitor will get automatically deleted. Otherwise, by default, the foreign key will require you to delete all the related rows out of CompetitionCompetitors before it will let you delete a Competitor.
(Some people think non-cascading foreign keys are a good safety precaution, but my experience is that they're just a freaking pain in the butt that are more often than not simply a result of an oversight and they create a bunch of make work for DBA's. Dealing with people accidentally deleting stuff is why you have things like "are you sure" dialogs and various types of regular backups and redundant data sources. It's far, far more common to actually want to delete a competitor, whose data is all messed up for example, than it is to accidentally delete one and then go "Oh no! I didn't mean to do that! And now I don't have their competition results! Aaaahh!" The latter is certainly common enough, so, you do need to be prepared for it, but the former is far more common, so the easiest and best way to prepare for the former, imo, is to just make foreign keys cascade updates and deletes.)
The answers above which talk about normalizing are great! But what if you find yourself in a position like me where you're not allowed to touch the database schema or structure as it stands? Eg, the DBA's are 'gods' and all suggested revisions go to /dev/null?
In that respect, I feel like this has been answered with this Stack Overflow posting too in regards to all the users above giving code samples.
I'm reposting the code from INSERT VALUES WHERE NOT EXISTS which helped me the most since I can't alter any underlying database tables:
INSERT INTO #table1 (Id, guidd, TimeAdded, ExtraData)
SELECT Id, guidd, TimeAdded, ExtraData
FROM #table2
WHERE NOT EXISTS (Select Id, guidd From #table1 WHERE #table1.id = #table2.id)
-----------------------------------
MERGE #table1 as [Target]
USING (select Id, guidd, TimeAdded, ExtraData from #table2) as [Source]
(id, guidd, TimeAdded, ExtraData)
on [Target].id =[Source].id
WHEN NOT MATCHED THEN
INSERT (id, guidd, TimeAdded, ExtraData)
VALUES ([Source].id, [Source].guidd, [Source].TimeAdded, [Source].ExtraData);
------------------------------
INSERT INTO #table1 (id, guidd, TimeAdded, ExtraData)
SELECT id, guidd, TimeAdded, ExtraData from #table2
EXCEPT
SELECT id, guidd, TimeAdded, ExtraData from #table1
------------------------------
INSERT INTO #table1 (id, guidd, TimeAdded, ExtraData)
SELECT #table2.id, #table2.guidd, #table2.TimeAdded, #table2.ExtraData
FROM #table2
LEFT JOIN #table1 on #table1.id = #table2.id
WHERE #table1.id is null
The above code uses different fields than what you have, but you get the general gist with the various techniques.
Note that as per the original answer on Stack Overflow, this code was copied from here.
Anyway my point is "best practice" often comes down to what you can and can't do as well as theory.
- If you're able to normalize and generate indexes/keys -- great!
- If not and you have the resort to code hacks like me, hopefully the above helps.
Good luck!
Ok, this was asked 7 years ago, but I think the best solution here is to forego the new table entirely and just do this as a custom view. That way you're not duplicating data, there's no worry about unique data, and it doesn't touch the actual database structure. Something like this:
CREATE VIEW vw_competitions
AS
SELECT
Id int
CompetitionName nvarchar(75)
CompetitionType nvarchar(50)
OtherField1 int
OtherField2 nvarchar(64) --add the fields you want viewed from the Competition table
FROM Competitions
GO
Other items can be added here like joins on other tables, WHERE clauses, etc. This is most likely the most elegant solution to this problem, as you now can just query the view:
SELECT *
FROM vw_competitions
...and add any WHERE, IN, or EXISTS clauses to the view query.
참고URL : https://stackoverflow.com/questions/5288283/sql-server-insert-if-not-exists-best-practice
'IT' 카테고리의 다른 글
lodash를 사용하여 목록에서 요소를 제거하는 방법은 무엇입니까? (0) | 2020.06.13 |
---|---|
술어와 일치하는 순서로 첫 번째 요소 찾기 (0) | 2020.06.13 |
다양한 깊이의 중첩 된 사전 값 업데이트 (0) | 2020.06.13 |
루프 나 조건문없이 1부터 1000까지 인쇄하는 C 코드는 어떻게 작동합니까? (0) | 2020.06.13 |
동일한 네트워크의 다른 컴퓨터에서이 로컬 호스트에 어떻게 연결합니까? (0) | 2020.06.13 |