IT

어느 것이 더 빠르거나 가장 좋습니까?

lottoking 2020. 6. 4. 08:09
반응형

어느 것이 더 빠르거나 가장 좋습니까? SELECT * 또는 SELECT column1, colum2, column3 등


특히 필요한 열에 SELECT *더 효율적이기 때문에 SQL 명령을 작성할 때 일반적으로 사용 하는 것이 좋지 않다고 들었습니다 SELECT.

SELECT테이블의 모든 열에 필요한 경우 사용해야합니까

SELECT * FROM TABLE

또는

SELECT column1, colum2, column3, etc. FROM TABLE

이 경우 효율성이 정말로 중요합니까? SELECT *모든 데이터가 실제로 필요한 경우 내부적으로 더 최적 이라고 생각 하지만 데이터베이스에 대한 실제적인 이해없이 이것을 말하고 있습니다.

이 경우 모범 사례가 무엇인지 궁금합니다.

업데이트 : 나는 아마 내가 정말 할 수있는 유일한 상황 지정해야 합니다 을 할 수는 SELECT *내가 모든 열은 항상 검색 할 필요가 알고 어디 새로운 열이 추가 된 경우에도, 하나 개의 테이블에서 데이터를 선택하고있을 때입니다.

그러나 내가 본 응답을 감안할 때, 이것은 여전히 ​​나쁜 생각처럼 보이고 SELECT *결코 더 많은 기술적 이유로 사용해서는 안됩니다.


특정 열을 선택하는 것이 더 좋은 이유 중 하나는 SQL Server가 테이블 데이터를 쿼리하지 않고 인덱스에서 데이터에 액세스 할 가능성이 높아지기 때문입니다.

여기에 내가 쓴 게시물이 있습니다 : 선택 쿼리가 잘못된 인덱스 적용 범위를 갖는 실제 이유

데이터를 소비하는 모든 코드는 나중에 테이블 스키마의 변경 사항에 관계없이 동일한 데이터 구조를 갖기 때문에 변경하기가 덜 취약합니다.


을 감안할 때 당신의 당신이 있음을 지정 하는 모든 열을 선택, 약간의 차이가 이 시간에이 . 그러나 데이터베이스 스키마가 변경됨을 인식하십시오. 만일 당신이 사용한다면 SELECT *, 당신의 코드는 그 새로운 데이터를 사용하거나 제시 할 준비가되어 있지 않더라도, 테이블에 새로운 열을 추가하게 될 것입니다. 이는 시스템이 예기치 않은 성능 및 기능 변경에 노출되고 있음을 의미합니다.

이 비용을 약간의 비용으로 기꺼이 기각 할 수도 있지만 필요하지 않은 열은 다음과 같아야합니다.

  1. 데이터베이스에서 읽기
  2. 네트워크를 통해 전송
  3. 프로세스에 마샬링
  4. (ADO 유형 기술의 경우) 메모리 내 데이터 테이블에 저장
  5. 무시 및 폐기 / 가비지 수집

항목 # 1에는 잠재적 인 커버링 인덱스를 제거하고 데이터 페이지로드 (및 서버 캐시 스 래싱)를 유발하고 행 / 페이지 / 테이블 잠금이 발생하는 등 많은 숨겨진 비용이 발생합니다.

열을 지정하는 *잠재적 절감 효과와 유일한 절약 효과를 비교하면 다음 과 같습니다.

  1. 프로그래머는 열을 추가하기 위해 SQL을 다시 방문 할 필요가 없습니다.
  2. SQL의 네트워크 전송이 더 작거나 빠릅니다
  3. SQL Server 쿼리 구문 분석 / 유효성 검사 시간
  4. SQL Server 쿼리 계획 캐시

항목 1의 경우 실제로는 추가 할 수있는 새로운 열을 사용하기 위해 코드를 추가 / 변경한다는 것이 현실이므로 워시입니다.

항목 2의 경우, 그 차이만으로도 다른 패킷 크기 또는 수의 네트워크 패킷으로 이동할 수 없습니다. SQL 문 전송 시간이 주요 문제인 지점에 도달하면 먼저 문 비율을 줄여야합니다.

항목 3의 경우 확장 *이 어쨌든 일어나야하기 때문에 비용을 절감 할 수 없습니다 . 이는 어쨌든 테이블 스키마를 참조하는 것을 의미합니다. 실제로 열을 나열하면 스키마에 대해 유효성을 검사해야하므로 동일한 비용이 발생합니다. 다시 말해 이것은 완전한 세척입니다.

항목 4의 경우 특정 열을 지정하면 쿼리 계획 캐시가 커질 수 있지만 다른 열 집합 (지정하지 않은 열)을 처리하는 경우 에만 가능합니다. 이 경우에, 당신은 원하는 게 필요에 따라 서로 다른 계획을 원하기 때문에 다른 캐시 항목을.

따라서 질문을 지정한 방식으로 인해 최종 스키마 수정에 직면 한 문제의 복원력이 저하됩니다. 이 스키마를 ROM으로 굽는 경우 (그렇게 발생) *완벽하게 허용됩니다.

그러나 내 일반적인 지침은 필요한 열만 선택해야한다는 것입니다. 즉, 때로는 모든 열을 요구하는 것처럼 보일 수도 있지만 DBA 및 스키마 진화는 쿼리에 큰 영향을 줄 수있는 일부 새로운 열이 나타날 수 있음을 의미합니다 .

내 충고는 항상 특정 열을 선택 해야한다는 것 입니다. 반복해서하는 일에 익숙해 지므로 올바르게하는 습관을 가지십시오.

코드 변경없이 스키마가 변경 될 수있는 이유가 궁금하다면 감사 로깅, 유효 / 만료 날짜 및 DBA가 규정 준수 문제에 대해 체계적으로 추가하는 기타 유사한 사항을 고려하십시오. 미처리 변경의 또 다른 원인은 시스템 또는 사용자 정의 필드의 다른 곳에서 성능을 저하시키는 것입니다.


필요한 열만 선택해야합니다. 모든 열이 필요하더라도 SQL Server가 열에 대한 시스템 테이블을 쿼리하지 않아도되도록 열 이름을 나열하는 것이 좋습니다.

또한 누군가가 테이블에 열을 추가하면 응용 프로그램이 중단 될 수 있습니다. 프로그램은 예상하지 못한 열을 가져오고 처리 방법을 모를 수 있습니다.

이 외에도 테이블에 이진 열이 있으면 쿼리 속도가 훨씬 느려지고 더 많은 네트워크 리소스가 사용됩니다.


select *나쁜 것에 네 가지 큰 이유가 있습니다 .

  1. 가장 중요한 실질적인 이유는 사용자가 열이 반환되는 순서를 마술로 알도록 강요하기 때문입니다. 명시 적으로하는 것이 낫습니다. 테이블 변경에 대해 보호합니다.

  2. If a column name you're using changes, it's better to catch it early (at the point of the SQL call) rather than when you're trying to use the column that no longer exists (or has had its name changed, etc.)

  3. Listing the column names makes your code far more self-documented, and so probably more readable.

  4. If you're transferring over a network (or even if you aren't), columns you don't need are just waste.


Specifying the column list is usually the best option because your application won't be affected if someone adds/inserts a column to the table.


Specifying column names is definitely faster - for the server. But if

  1. performance is not a big issue (for example, this is a website content database with hundreds, maybe thousands - but not millions - of rows in each table); AND
  2. your job is to create many small, similar applications (e.g. public-facing content-managed websites) using a common framework, rather than creating a complex one-off application; AND
  3. flexibility is important (lots of customization of the db schema for each site);

then you're better off sticking with SELECT *. In our framework, heavy use of SELECT * allows us to introduce a new website managed content field to a table, giving it all of the benefits of the CMS (versioning, workflow/approvals, etc.), while only touching the code at a couple of points, instead of a couple dozen points.

I know the DB gurus are going to hate me for this - go ahead, vote me down - but in my world, developer time is scarce and CPU cycles are abundant, so I adjust accordingly what I conserve and what I waste.


SELECT * is a bad practice even if the query is not sent over a network.

  1. Selecting more data than you need makes the query less efficient - the server has to read and transfer extra data, so it takes time and creates unnecessary load on the system (not only the network, as others mentioned, but also disk, CPU etc.). Additionally, the server is unable to optimize the query as well as it might (for example, use covering index for the query).
  2. After some time your table structure might change, so SELECT * will return a different set of columns. So, your application might get a dataset of unexpected structure and break somewhere downstream. Explicitly stating the columns guarantees that you either get a dataset of known structure, or get a clear error on the database level (like 'column not found').

Of course, all this doesn't matter much for a small and simple system.


Performance wise, SELECT with specific columns can be faster (no need to read in all the data). If your query really does use ALL the columns, SELECT with explicit parameters is still preferred. Any speed difference will be basically unnoticeable and near constant-time. One day your schema will change, and this is good insurance to prevent problems due to this.


You should really be selecting only the fields you need, and only the required number, i.e.

SELECT Field1, Field2 FROM SomeTable WHERE --(constraints)

Outside of the database, dynamic queries run the risk of injection attacks and malformed data. Typically you get round this using stored procedures or parameterised queries. Also (although not really that much of a problem) the server has to generate an execution plan each time a dynamic query is executed.


Lots of good reasons answered here so far, here's another one that hasn't been mentioned.

Explicitly naming the columns will help you with maintenance down the road. At some point you're going to be making changes or troubleshooting, and find yourself asking "where the heck is that column used".

If you've got the names listed explicitly, then finding every reference to that column -- through all your stored procedures, views, etc -- is simple. Just dump a CREATE script for your DB schema, and text search through it.


definitely defining the columns, because SQL Server will not have to do a lookup on the columns to pull them. If you define the columns, then SQL can skip that step.


It's always better to specify the columns you need, if you think about it one time, SQL doesn't have to think "wtf is *" every time you query. On top of that, someone later may add columns to the table that you actually do not need in your query and you'll be better off in that case by specifying all of your columns.


The problem with "select *" is the possibility of bringing data you don't really need. During the actual database query, the selected columns don't really add to the computation. What's really "heavy" is the data transport back to your client, and any column that you don't really need is just wasting network bandwidth and adding to the time you're waiting for you query to return.

Even if you do use all the columns brought from a "select *...", that's just for now. If in the future you change the table/view layout and add more columns, you'll start bring those in your selects even if you don't need them.

Another point in which a "select *" statement is bad is on view creation. If you create a view using "select *" and later add columns to your table, the view definition and the data returned won't match, and you'll need to recompile your views in order for them to work again.

I know that writing a "select *" is tempting, 'cause I really don't like to manually specify all the fields on my queries, but when your system start to evolve, you'll see that it's worth to spend this extra time/effort in specifying the fields rather than spending much more time and effort removing bugs on your views or optimizing your app.


While explicitly listing columns is good for performance, don't get crazy.

So if you use all the data, try SELECT * for simplicity (imagine having many columns and doing a JOIN... query may get awful). Then - measure. Compare with query with column names listed explicitly.

Don't speculate about performance, measure it!

Explicit listing helps most when you have some column containing big data (like body of a post or article), and don't need it in given query. Then by not returning it in your answer DB server can save time, bandwidth, and disk throughput. Your query result will also be smaller, which is good for any query cache.


Select is equally efficient (in terms of velocity) if you use * or columns.

The difference is about memory, not velocity. When you select several columns SQL Server must allocate memory space to serve you the query, including all data for all the columns that you've requested, even if you're only using one of them.

What does matter in terms of performance is the excecution plan which in turn depends heavily on your WHERE clause and the number of JOIN, OUTER JOIN, etc ...

For your question just use SELECT *. If you need all the columns there's no performance difference.


It is NOT faster to use explicit field names versus *, if and only if, you need to get the data for all fields.

Your client software shouldn't depend on the order of the fields returned, so that's a nonsense too.

And it's possible (though unlikely) that you need to get all fields using * because you don't yet know what fields exist (think very dynamic database structure).

Another disadvantage of using explicit field names is that if there are many of them and they're long then it makes reading the code and/or the query log more difficult.

So the rule should be: if you need all the fields, use *, if you need only a subset, name them explicitly.


The result is too huge. It is slow to generate and send the result from the SQL engine to the client.

The client side, being a generic programming environment, is not and should not be designed to filter and process the results (e.g. the WHERE clause, ORDER clause), as the number of rows can be huge (e.g. tens of millions of rows).


Naming each column you expect to get in your application also ensures your application won't break if someone alters the table, as long as your columns are still present (in any order).


It depends on the version of your DB server, but modern versions of SQL can cache the plan either way. I'd say go with whatever is most maintainable with your data access code.


One reason it's better practice to spell out exactly which columns you want is because of possible future changes in the table structure.

If you are reading in data manually using an index based approach to populate a data structure with the results of your query, then in the future when you add/remove a column you will have headaches trying to figure out what went wrong.

As to what is faster, I'll defer to others for their expertise.


As with most problems, it depends on what you want to achieve. If you want to create a db grid that will allow all columns in any table, then "Select *" is the answer. However, if you will only need certain columns and adding or deleting columns from the query is done infrequently, then specify them individually.

It also depends on the amount of data you want to transfer from the server. If one of the columns is a defined as memo, graphic, blob, etc. and you don't need that column, you'd better not use "Select *" or you'll get a whole bunch of data you don't want and your performance could suffer.


To add on to what everyone else has said, if all of your columns that you are selecting are included in an index, your result set will be pulled from the index instead of looking up additional data from SQL.


SELECT * is necessary if one wants to obtain metadata such as the number of columns.


What everyone above said, plus:

If you're striving for readable maintainable code, doing something like:

SELECT foo, bar FROM widgets;

is instantly readable and shows intent. If you make that call you know what you're getting back. If widgets only has foo and bar columns, then selecting * means you still have to think about what you're getting back, confirm the order is mapped correctly, etc. However, if widgets has more columns but you're only interested in foo and bar, then your code gets messy when you query for a wildcard and then only use some of what's returned.


And remember if you have an inner join by definition you do not need all the columns as the data in the join columns is repeated.

It's not like listing columns in SQl server is hard or even time-consuming. You just drag them over from the object browser (you can get all in one go by dragging from the word columns). To put a permanent performance hit on your system (becasue this can reduce the use of indexes and becasue sending unneeded data over the network is costly) and make it more likely that you will have unexpected problems as the database changes (sometimes columns get added that you do not want the user to see for instance) just to save less than a minute of development time is short-sighted and unprofessional.


Absolutely define the columns you want to SELECT every time. There is no reason not to and the performance improvement is well worth it.

They should never have given the option to "SELECT *"


If you need every column then just use SELECT * but remember that the order could potentially change so when you are consuming the results access them by name and not by index.

I would ignore comments about how * needs to go get the list - chances are parsing and validating named columns is equal to the processing time if not more. Don't prematurely optimize ;-)


In terms of execution efficiency I am not aware of any significant difference. But for programmers efficiency I would write the names of the fields because

  • You know the order if you need to index by number, or if your driver behaves funny on blob-values, and you need a definite order
  • You only read the fields you need, if you should ever add more fields
  • You get an sql-error if you misspell or rename a field, not an empty value from a recordset/row
  • You can better read what's going on.

hey, be practical. use select * when prototyping and select specific columns when implementing and deploying. from an execution plan perspective, both are relatively identical on modern systems. however, selecting specific columns limits the amount of data that has to be retrieved from disk, stored in memory and sent over the network.

ultimately the best plan is to select specific columns.


Also keep changes in mind. Today, Select * only selects the columns that you need, but tomorrow it may also select that varbinary(MAX) column that i've just added without telling you, and you are now also retreiving all 3.18 Gigabytes of Binary Data that wasn't in the table yesterday.

참고URL : https://stackoverflow.com/questions/65512/which-is-faster-best-select-or-select-column1-colum2-column3-etc

반응형