Is Oracle always the best
tool or application to manipulate data? The answer lies within what you mean by
manipulate. Data can be transformed via data manipulation language, and it can
also be transformed via string or numerical manipulation. To distinguish
between the types of manipulation, let’s say that stored data is in the purview
of DML, and output or displayed data is what or how stored data is presented to
the user. A simple example of this is a date. What is stored can be forced to
display in numerous ways. March 8th, 2007 is just as valid an output
as 03/08/2007; it just depends on what you or your users want to see.
The point of this article
concerns how you get from what is stored to what is displayed. The construct
for this point revolves around a string/numeric manipulation problem. Suppose
you have an application that supports surveys or offers a coupon to users via
the Internet. After clicking Submit on an online survey form, it is very common
for your identifier, whether it is a survey ID or an account number, to be
passed back as one of many name-value pairs in a URL.
Another use of this “pass
back data” transport mechanism is to validate or limit your use of a
promotional coupon. Coupons, whether paper or electronic in form, are typically
represented in a numeric format, and quite often, as very large numbers. The
very large number aspect makes it hard to counterfeit or fake numbers since the
density of numbers can be made to be sparse. Issue one million coupons with
barcode numbers ranging from 1,000,000 to 1,999,999 and you have a density of
one. Using a range of 77,000,000,000 to 78,000,000,000 and finding a valid
coupon number just became the equivalent of searching for a needle in the
haystack.
The problem or constraint,
however, is that passing back a 22-digit barcode number consumes valuable space
or length in the URL (using the 255 or 256 character limit). What we need is a
way to shorten up the big number into something a bit smaller/shorter. The
method is what leads us into the “how” of how are we going to accomplish this.
One method is to change the
base of the number. If you take a base-10 number and convert it into a base-36
number, you save some space. Using Oracle, how would you perform the
conversion?
A PL/SQL approach
Let’s look at a PL/SQL
approach first. The code has been optimized a bit, in that the factorization
has a jump-start in terms of position or iterations already taken into account.
It is also based on the input being close to at least 22 digits long.
CREATE OR REPLACE FUNCTION gen_barcode36 (i_coupon_number NUMBER)
RETURN VARCHAR2 IS
v_number number;
v_curval number;
v_curinc number := power(36,14);
v_true number := 0 ;
v_pos number := 15;
v_dec_count number := 0 ;
v_pos_val number := 0 ;
v_cur_num number := 0 ;
v_test_num number := 0 ;
v_new_num number := 0 ;
v_cur_digit char := '';
v_new_val varchar2(15) := '';
BEGIN
v_number := i_coupon_number;
while ( v_true = 0 ) loop
v_curinc := v_curinc * 36;
v_curval := v_number / v_curinc;
if ( v_curval < 1 ) then
v_true := 1;
else
v_pos := v_pos + 1;
end if;
end loop;
v_dec_count := v_pos;
v_new_val := NULL;
v_cur_num := v_number;
WHILE ( v_dec_count > 0) LOOP
v_pos_val := power(36,v_dec_count - 1);
v_test_num := trunc(v_cur_num/v_pos_val);
select decode(v_test_num,35,'z', 34,'y', 33,'x',
32,'w', 31,'v', 30,'u',
29,'t', 28,'s', 27,'r',
26,'q', 25,'p', 24,'o',
23,'n', 22,'m', 21,'l',
20,'k', 19,'j', 18,'i',
17,'h', 16,'g', 15,'f',
14,'e', 13,'d', 12,'c',
11,'b', 10,'a', 9,'9',
8,'8', 7,'7', 6,'6',
5,'5', 4,'4', 3,'3',
2,'2', 1,'1', '0')
INTO v_cur_digit FROM DUAL;
IF ( v_new_val IS NOT NULL ) THEN
v_new_val := v_new_val || v_cur_digit;
ELSE
v_new_val := v_cur_digit;
END IF;
v_cur_num := v_cur_num - (v_pos_val * v_test_num);
v_dec_count := v_dec_count - 1;
END LOOP;
RETURN v_new_val;
END gen_barcode36;
/
A different version of this
has some output attached to it so you can see how the number is reduced.
SQL> select gen_barcode36(7700000000000000000000) from dual;
GEN_BARCODE36(7700000000000000000000)
-----------------------------------------------------
1950zn8fqxjygow
Starting with 7700000000000000000000
Position: 6140942214464815497216 current number: 1
Position 14 current value: 1
Position: 170581728179578208256 current number: 9
Position 13 current value: 19
Position: 4738381338321616896 current number: 5
Position 12 current value: 195
Position: 131621703842267136 current number: 0
Position 11 current value: 1950
Position: 3656158440062976 current number: 35
Position 10 current value: 1950z
Position: 101559956668416 current number: 23
Position 9 current value: 1950zn
Position: 2821109907456 current number: 8
Position 8 current value: 1950zn8
Position: 78364164096 current number: 15
Position 7 current value: 1950zn8f
Position: 2176782336 current number: 26
Position 6 current value: 1950zn8fq
Position: 60466176 current number: 33
Position 5 current value: 1950zn8fqx
Position: 1679616 current number: 19
Position 4 current value: 1950zn8fqxj
Position: 46656 current number: 34
Position 3 current value: 1950zn8fqxjy
Position: 1296 current number: 16
Position 2 current value: 1950zn8fqxjyg
Position: 36 current number: 24
Position 1 current value: 1950zn8fqxjygo
Position: 1 current number: 32
Position 0 current value: 1950zn8fqxjygow
Final value: 1950zn8fqxjygow
This isn’t meant to be
security through obscurity, although a relatively strange looking string will
discourage most users from trying to figure out what it represents.
One at a time, the data
manipulation performance is okay, but how long would it take to generate a
million converted values? Let’s make a table with three columns. The first is
the base-10 number, the second is the base-36 converted/manipulated value using
the PL/SQL code above, and the third is reserved for the output using a
different means to manipulate the numbers. Create the table and populate it
with a million records. Turn timing on and see how long it takes to generate
the base-36 values.
SQL> create table base36
2 (barcode number,
3 plsql_ver varchar2(15),
4 other_ver varchar2(15));
Table created.
SQL> create sequence barcode_seq start with 7700000000000000000000;
Sequence created.
SQL> begin
2 for i in 1..1000000 loop
3 insert into base36 (barcode) values (barcode_seq.nextval);
4 end loop;
5 end;
6 /
PL/SQL procedure successfully completed.
SQL> commit;
Commit complete.
SQL> set timing on
SQL> update base36 set plsql_ver = gen_barcode36(barcode);
1000000 rows updated.
Elapsed: 00:33:09.01
The rate works out to be
just over 500 updates per second.
Might there be a faster way
outside of PL/SQL to accomplish this task? This problem is a good example of
when to use a better language, so to speak, than PL/SQL to transform data. A
better language or approach in this case is to manipulate the string using Java
and a built-in method (toString).
How do you get Java code into a database?
First, we need to have a
java file with the applicable Java code in it. Second, is to upload the code
(either the source or compiled version) into the database. Third is to compile
the code, and then the last step is to create a wrapper function or procedure
around the Java code. The function calls the Oracle-named object, which in turn
calls the Java-named object.
The source file/Java code is
pretty straightforward.
import java.math.BigInteger;
import java.lang.String;
public class BCUtils
{
public static String getBarcode36(String barcode10)
{
String value = new String(barcode10);
BigInteger bigI = new BigInteger(value.toString());
StringBuffer result = new StringBuffer(bigI.toString(36));
return result.toString();
}
}
Load the source file and
compile it.
Create a wrapper function/publish
the class.
SQL> create or replace function get_bc36(bc_10 varchar2)
2 return varchar2
3 as language java name
4 'BCUtils.getBarcode36(java.lang.String) return java.lang.String';
5 /
Function created.
Now we’re ready to test the
difference.
SQL> update base36 set other_ver = get_bc36(barcode);
1000000 rows updated.
Elapsed: 00:13:56.22
The rate above works out to
almost 1200 per second, well more than twice as fast as the PL/SQL version. Why
is that? Oracle acknowledges that PL/SQL isn’t the fastest language in the
world or the best at string manipulation.
A more formal list of steps
is shown in the section titled “Java
Stored Procedure Steps” in the Java Developer’s Guide.
In Closing
PL/SQL can do many things
well, but other languages can do some things much better, with “better” being
measured in terms of speed. If you find that stored data must be manipulated
for display type output, don’t be afraid to try other languages supported by
Oracle.
Back to DBAsupport.com