'Rails SQL "select in" across several columns: where (code1, code2) in (("A", 1), ("A", 3), ("Q", 9))

I have a business requirement to select records based on two fields in one table: code1 and code2. The selection is complex and hard-coded, with no codeable rhyme or reason, and includes about a dozen pairs, out of the hundred pairs that actually exist in the table.

  • C, 1
  • C, 2
  • J, 9
  • Z, 0

Note that there are other "C" codes in the table, such as (C, 3). There is no combined field that captures them both as a value, e.g, "C3".

SQL supports a query like this: Two columns in subquery in where clause e.g.

SELECT * from rejection_codes
  where (code1, code2) in (("A", 1), ("A", 3), ("Q", 9))

Is there a way to do this with Rails and ActiveRecord's ORM, without resorting to raw SQL?

I'm running Rails 4.2.9 with Postgres, if it matters.

* Why Don't You... *

Add a field: I don't have control over the database schema. If I did, I'd add a new column as a flag for this group. Or a computed column that concatenates the values into a string. Or something... But I can't.

Use raw SQL: Yeah...I might do that if I can't do it through the ORM.



Solution 1:[1]

If you want exactly that structure then you can do things like this:

pairs = [['A', 1], ['A', 3], ['Q', 9]]
RejectionCode.where('(code1, code2) in ((?), (?), (?))', *pairs)

Of course, pairs.length presumably won't always be three so you could say:

pairs = [['A', 1], ['A', 3], ['Q', 9]]
placeholders = (%w[(?)] * pairs.length).join(', ')
RejectionCode.where("(code1, code2) in (#{placeholders})", *pairs)

Yes, that's using string interpolation to build an SQL snippet but it is perfectly safe in this case because you're building all the strings and you know exactly what's in them. If you put this into a scope then at least the ugliness would be hidden and you could easily cover it with your test suite.

Alternatively, you could take advantage of some equivalences. An in is a fancy or so these do roughly the same thing:

c in (x, y, z)
c = x or c = y or c = z

and records (even anonymous ones) are compared column by column so these are equivalent:

(a, b) = (x, y)
a = x and b = y

That means that something like this:

pairs = [['A', 1], ['A', 3], ['Q', 9]]
and_pair = ->(a) { RejectionCode.where('code1 = ? and code2 = ?', *a) }
and_pair[pairs[0]].or(and_pair[pairs[1]]).or(and_pair[pairs[2]])

should give you the same result. Or more generally:

pairs = [['A', 1], ['A', 3], ['Q', 9], ... ]
and_pair = ->(a) { RejectionCode.where('code1 = ? and code2 = ?', *a) }
query = pairs[1..-1].inject(and_pair[pairs.first]) { |q, a| q.or(and_pair[a]) }

Again, you'd want to hide this ugliness in a scope.

Solution 2:[2]

* This is a decent workaround, but not exactly a solution to the ORM question *

Failing to find the right way to do this in ActiveRecord, I just guessed, hoping for the best:

class ApprovalCode < ActiveRecord::Base

  REJECTION_CODES = [
    ['A', '0'],
    ['R', '1'],
    ['R', '5'],
    ['R', '6'],
    ['X', 'F'],
    ['X', 'G']
  ]

  scope :rejection_allowed, -> { where([:code, :sub_code], REJECTION_CODES) }  # This didn't work.

end

That did not work. So, I used raw SQL in the scope, and this did work:

  scope :rejection_allowed, -> { where("(code, sub_code) in (#{rejection_list})") }

  def self.rejection_list
    REJECTION_CODES
      .map{|code, sub_code| "('#{code}', '#{sub_code}')"}
      .join(', ')
  end

I am still hopeful to find how to do this in the ORM, or read suggestions on completely different approaches to the problem. Since it's all encapsulated in a scope and a constant, it will be trivial to refactor later, and keeping the constants and the scope separate will allow for painless tests.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mu is too short
Solution 2 David Hempy