[Pgbigm-hackers] triconsistent function support for pg_bigm


Fujii Masao masao****@gmail*****
Wed Aug 5 14:07:11 JST 2015


On Fri, Jul 24, 2015 at 12:01 PM, Amit Langote <amitl****@gmail*****> wrote:
>
>
> On Friday, July 24, 2015, Fujii Masao <masao****@gmail*****> wrote:
>>
>> Hi,
>>
>>
>> http://www.postgresql.org/message-id/E1ZHC****@gemul*****
>>
>> The above commit introduced the triconsistent function into pg_trgm GIN
>> opclass to improve the performance of full text search using that opclass.
>> So what about also applying this change to pg_bigm for better
>> performance?
>> Attached is a WIP patch which adds the triconsistent function to
>> pg_bigm.
>>
>> One big side effect of this patch is that the patched pg_bigm can no
>> longer be compiled with PostgreSQL server 9.3 or earlier. This is because
>> the triconsistent API in GIN index is supported only in 9.4 or later. I
>> think that we can live with this situation if we keep maintaining and
>> providing the current stable version (i.e., 1.1) for 9.3 or earlier.
>> Thoughts?
>
>
> +1
>
> From Jeff Janes's pg_trgm 1.2 proposal, it seems clear that triconsistent
> functions may be a win for longer search strings. Perhaps pg_bigm will gain
> immensely from that.
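
To make the idea concrete, here is a minimal sketch of the ternary logic
behind a triconsistent function. It is only an illustration in Python, not
the code in the WIP patch. It assumes a search strategy in which every
extracted key must be present and the index match is lossy, so a recheck is
always needed. The names Ternary and tri_consistent_all_required are
hypothetical; the three values mirror GIN's GIN_FALSE, GIN_TRUE and
GIN_MAYBE.

```python
from enum import Enum


class Ternary(Enum):
    """Three-valued result, mirroring GIN_FALSE / GIN_TRUE / GIN_MAYBE."""
    FALSE = 0
    TRUE = 1
    MAYBE = 2


def tri_consistent_all_required(check):
    """Hypothetical ternary consistency check for an 'all keys required' search.

    check holds one Ternary value per extracted key. MAYBE means the index
    has not determined yet whether the item contains that key.
    """
    for value in check:
        if value is Ternary.FALSE:
            # One required key is known to be missing, so the item cannot
            # match, whatever the undetermined keys turn out to be.
            return Ternary.FALSE
    # No key is known to be missing. The match is assumed to be lossy, so
    # the answer is MAYBE (recheck against the heap tuple), never TRUE.
    return Ternary.MAYBE
```

Being able to answer FALSE while some keys are still MAYBE is what lets the
index skip work, which is presumably why longer search strings benefit most.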

I ran a simple benchmark to check whether triConsistent improves the
performance of pg_bigm full text search.

In the test, I used Japanese Wikipedia title data. The number of records
was 2,487,075. I executed a full text search with a keyword containing
24 Japanese characters, ten times, and took the average of those response
times as the final result. Here are the benchmark results.

  33.668 ms (1.1)
   4.131 ms (1.2)
   3.397 ms (1.1 with gin_key_limit = 8)
   2.324 ms (1.2 with gin_key_limit = 8)
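
The measurement procedure described above (ten runs, averaged) can be
sketched roughly as follows. This is not the script used for the numbers
above. average_response_ms and run_query are hypothetical names, and
run_query stands for any callable that executes the search once, for
example through a database driver.

```python
import time
from statistics import mean


def average_response_ms(run_query, runs=10):
    """Call run_query `runs` times and return the mean wall-clock time in ms.

    run_query is a hypothetical zero-argument callable that executes the
    full text search once.
    """
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        elapsed = time.perf_counter() - start
        timings_ms.append(elapsed * 1000.0)  # seconds -> milliseconds
    return mean(timings_ms)
```

Note that this measures client-side wall-clock time, so it includes any
driver and network overhead in addition to the server-side execution time.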

These results show that when the full text search keyword contains many
characters, the response time can be very long in 1.1 (33.668 ms here,
versus 4.131 ms in 1.2). We can improve the performance in this case by
upgrading pg_bigm to 1.2 or by using the gin_key_limit parameter. 1.2 is
faster than 1.1 even when gin_key_limit is used.
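
For reference, the speedup ratios implied by the four numbers above can be
worked out with a few lines of Python. The figures are copied from the
results; the dictionary keys and the speedup helper are just illustrative
names.

```python
# Average response times in milliseconds, copied from the results above.
results_ms = {
    "1.1": 33.668,
    "1.2": 4.131,
    "1.1 with gin_key_limit = 8": 3.397,
    "1.2 with gin_key_limit = 8": 2.324,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return results_ms[baseline] / results_ms[candidate]


for baseline, candidate in [
    ("1.1", "1.2"),
    ("1.1", "1.1 with gin_key_limit = 8"),
    ("1.1 with gin_key_limit = 8", "1.2 with gin_key_limit = 8"),
]:
    print(f"{candidate} vs {baseline}: {speedup(baseline, candidate):.2f}x")
```

So in this one test, 1.2 is about 8 times faster than 1.1 without
gin_key_limit, and about 1.5 times faster when both use gin_key_limit = 8.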

Regards,

-- 
Fujii Masao



