Parallelizing Lookups with Common Table Expressions
Individual lookups against a rules system stored in a database can add up to a lot of computations and depending on the size of your ruleset cause delays. There are ways around this with caching, in memory hashmaps. However for this approach I used a Common Table Expression (CTE) to break apart the list of lookups and let SQL do them in parallel
In this case there is a text input of variable length, example: Hello
that needs to be translated letter by letter to a machine expected readable prompt. Here is the input/output for this
- Input:
Hello
- Output:
3222434353
Each character has a corresponding code output H
= 32
and so forth defined by a table of rules functioning as the lookup table with the general format of
CREATE TABLE Translations (
[InputCode] NCHAR,
[OutputCode] NVARCHAR(100)
)
So how to split the input which is in a word format by each character and perform the lookup (matching only where a lookup exists and proceeding onward with gaps so that ordering doesn't change)
Split the inputs
Use the STUFF and UNION ALL to split the characters and row orient them as so
STUFF will split the character out and reduce the input so that the next call to UNION ALL on the expression will reduce the input until it is split into component parts
DECLARE @textToTranslate NVARCHAR(MAX) = 'Hello'
SELECT STUFF(@textToTranslate,1,1,'') TXT, LEFT(@textToTranslate,1) [InputChar]
Gives you
TXT | InputChar |
---|---|
ello | H |
Now combine it with the next step
SELECT STUFF(TXT,1,1,'') TXT, LEFT(TXT,1) [InputChar] FROM [Splitter]
WHERE LEN(TXT) > 0
And turn it into a CTE
DECLARE @textToTranslate NVARCHAR(MAX) = 'Hello'
;WITH [Splitter] AS
(
SELECT STUFF(@textToTranslate,1,1,'') TXT, LEFT(@textToTranslate,1) [InputChar]
UNION ALL
SELECT STUFF(TXT,1,1,'') TXT, LEFT(TXT,1) [InputChar] FROM [Splitter]
WHERE LEN(TXT) > 0
)
SELECT *
FROM [Splitter] [s]
TXT | InputChar |
---|---|
ello | H |
llo | e |
lo | l |
o | l |
o |
Do the lookups
Now that the input text has been row oriented per character can left join it against the lookup table and get the output
DECLARE @textToTranslate NVARCHAR(MAX) = 'Hello'
;WITH [Splitter] AS
(
SELECT STUFF(@textToTranslate,1,1,'') TXT, LEFT(@textToTranslate,1) [InputChar]
UNION ALL
SELECT STUFF(TXT,1,1,'') TXT, LEFT(TXT,1) [InputChar] FROM [Splitter]
WHERE LEN(TXT) > 0
)
SELECT [s].[InputChar], [converter].[OutputCode]
FROM [Splitter] [s]
LEFT JOIN
(
SELECT *
FROM @translations
) AS [converter]
ON [s].[InputChar] = [converter].[InputCode]
InputChar | OutputCode |
---|---|
H | 32 |
e | 22 |
l | 43 |
l | 43 |
o | 53 |
Example translation table variable to use for testing the above
DECLARE @translations TABLE (
[InputCode] NCHAR,
[OutputCode] NVARCHAR(100)
)
INSERT INTO @translations ([InputCode], [OutputCode])
VALUES
('A', '11'),
('B', '12'),
('C', '13'),
('D', '21'),
('E', '22'),
('F', '23'),
('G', '31'),
('H', '32'),
('I', '33'),
('J', '41'),
('K', '42'),
('L', '43'),
('M', '51'),
('N', '52'),
('O', '53'),
('P', '61'),
('Q', '62'),
('R', '63'),
('S', '71'),
('T', '72'),
('U', '73'),
('V', '81'),
('W', '82'),
('X', '83'),
('Y', '91'),
('Z', '92')