I was helping someone debug a slow query today when I noticed a common mistake that a lot of developers do. He was trying to get a list of users given a list of user id numbers.

The query looks something like this:

SELECT * FROM users
WHERE
  id = 1 OR
  id = 2 OR
  id = 3 OR
  id = 4 OR
  id = 5

For small data sets, SQL queries like the one above are fine. There isn’t much data for MySQL to search through and retrieve. You wouldn’t notice much of a performance hit on your servers at all.

But what if you had over a million users and you need to retrieve records from a list of 50,000 user id numbers? A better way to rewrite the query is to use the IN clause.

SELECT * FROM users
WHERE
  id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,5000001)

You can also pass in strings and other data types:

SELECT * FROM users
WHERE
  first_name IN ('John', 'Jane', 'James', 'Jack')

You can also do NOT IN:

SELECT * FROM users
WHERE
  first_name NOT IN ('Alex', 'Abby', 'Andy', 'Audrey')

Hope this helps improve your data retrieval times. If anything, it definitely makes for easier to read queries.

Happy querying!

http://www.georgetruong.com/wp-content/plugins/sociofluid/images/digg_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/reddit_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/dzone_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/stumbleupon_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/delicious_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/technorati_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/google_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/myspace_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/facebook_48.png http://www.georgetruong.com/wp-content/plugins/sociofluid/images/twitter_48.png

4 Comments to “MySQL - Using the “IN” clause”

  • [...] MySQL – Using the “IN” clause I was helping someone debug a slow query today when I noticed a common mistake that a lot of developers do. He was trying to get a list of users given a list of user id numbers. The query looks something like this: [...]

  • Ashitkin Alexander says:

    in clause in sql? AMAZING!

  • Chris says:

    I’ve had the opposite results for queries with 20 - 100 or so PK IDs. Doing “id=1 or id=2 …” gave me a big boost in performance. [The tables in question are anywhere from 100K to 2M records].

    I guess it’s worth mentioning that benchmarking the two would be a good idea…

    Another thing to watch out for is network I/O for these queries, regardless of using “IN” or a string of “OR”s. A good way around that might be to create a tmp table, fill it and do an inner join… keeping all the data on the DB servers and not moving it to and from the web / app servers :-)

    Very nice article, btw :-)

  • Chris, thanks for the comment. But my disclaimer is that this is for beginners. :-)

    That’s really interesting. I’ve never personally experienced an instance where “id=1 or id=2…” is faster, but then again if a query is slower I usually look at other areas. That would be an interesting case to look at. I would definitely need to run some benchmarks.

    You are absolutely right about keeping all the data on the DB servers with a temporary table. If you could help it, it’s good to only send data you need to your web app.

Post comment

What I do

Expertise PHP
Internet Engineering
Build highly scalable web applications in Ruby on Rails, PHP MVC, and Python / Django.
Expertise2
Frontend Programming
Create shiny websites with CSS, (X)HTML, Javascript, AJAX, and Web 2.0 sugar.
Expertise3
LAMP Administration
Apache, MySQL, Memcache, CRON jobs, and the magic sauce that make web apps tick.
Expertise4
iPhone / OS X Development
Develop applications for iPhone and OS X in Objective-C / Cocoa for kicks and giggles.

View my portfolio »

Popular Posts

Archives

About me

My name is George. I am a Software Engineer with about three years experience developing in open-source internet technology. I currently live and work in Los Angeles, California.

I am currently a Ruby on Rails developer. But the majority of my experience has been in PHP. I have also dabbled in C# / ASP.NET and Python / Django for various projects.

View my résumé or contact me today.

Get in touch!

You may either use my contact form, or send and email to me directly at george@georgetruong.com.

You could also try to reach me at 626-817-2633.

Twitter LinkedIn Google Facebook