Re: set find
  Home FAQ Contact Sign in
comp.lang.c++.moderated only
 
Advanced search
POPULAR GROUPS

more...

comp.lang.c++.moderated Profile…
 Up
Re: set find         


Author: John
Date: Jul 23, 2007 07:47

On Jul 21, 9:41 am, Erik Wikstr
8 Comments
Re: set find         


Author: Roman.Perepelitsa
Date: Jul 24, 2007 02:28

> There is a set::insert where you can pass in a hint iterator, I wish
> there was a set::find version that took hint(s). Basically I have
> have two sets - one big set and one fairly small set. I am trying to
> do a set intersection. Both are sets of strings. The small set
> contains strings all with the same prefix. Eg. all have the prefix
> "foo" for instance. It would be great if I could pass a hint to find
> or even better if I could pass an upper and lower bounds - so find
> could possibly take advantage of the information that I have passed
> in.

You can try 2 optimizations.

1. Find intersection for your small set and subset of large set.
Subset is found with the help of lower_bound and upper_bound.
Unfortunately, you have to create temporary set in this case.
2. Use alternative intersection algorithm with complexity
O(log(size_of_set1) * size_of_set2).

Note that if you use sorted vector instead of set, then you can
use first approach (with subset) without the need for temporary
set.

Here is some code.
Show full article (2.65Kb)
no comments
Re: set find         


Author: John
Date: Jul 24, 2007 11:37

On Jul 24, 6:14 am, "Roman.Perepeli...@gmail.com"
gmail.com> wrote:
>> There is a set::insert where you can pass in a hint iterator, I wish
>> there was a set::find version that took hint(s). Basically I have
>> have two sets - one big set and one fairly small set. I am trying to
>> do a set intersection. Both are sets of strings. The small set
>> contains strings all with the same prefix. Eg. all have the prefix
>> "foo" for instance. It would be great if I could pass a hint to find
>> or even better if I could pass an upper and lower bounds - so find
>> could possibly take advantage of the information that I have passed
>> in.
>
> You can try 2 optimizations.
>
> 1. Find intersection for your small set and subset of large set.
> Subset is found with the help of lower_bound and upper_bound.
> Unfortunately, you have to create temporary set in this case.
> 2. Use alternative intersection algorithm with complexity
> O(log(size_of_set1) * size_of_set2).
> ...
Show full article (3.33Kb)
no comments
Re: set find         


Author: Roman.Perepelitsa
Date: Jul 25, 2007 10:47

> The above looks great - except I have one question here..
> The creation of:
>
> Set sub(
> big.lower_bound(*small.begin()),
> big.upper_bound(*small.rbegin()));
>
> This could be somewhat expensive.
>
> If C++ had idea of subsets of a set that did not have to copy
> to create the new set this would be perfect.

You are right, this copy can be expensive. But if 'small' contains
only a few objects, and sub contains only a few objects, then it
can be faster than other solutions.
Show full article (2.80Kb)
no comments
Re: set find         


Author: Carl Barron
Date: Jul 25, 2007 16:57

In article <1185362774.896921.277440@o61g2000hsh.googlegroups.com>,
<"Roman.Perepelitsa@gmail.com"> wrote:
> // complexity is O(log(e1 - b1) * (e2 - b2))
> template
> void log_intersection(RA1 b1, RA1 e1,
> RA2 b2, RA2 e2, Out out)
> {
> for (; b2 != e2; ++b2)
> if (binary_search(b1, e1, *b2))
> *out++ = *b2;
> }
what makes you think that this is faster std::set_intersection?
its slower!! set_intersection be faster since log2(N1) > 1.

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
no comments
Re: set find         


Author: Roman.Perepelitsa
Date: Jul 26, 2007 07:17

> In article <1185362774.896921.277...@o61g2000hsh.googlegroups.com>,
>
> <"Roman.Perepeli...@gmail.com"> wrote:
>> // complexity is O(log(e1 - b1) * (e2 - b2))
>> template
>> void log_intersection(RA1 b1, RA1 e1,
>> RA2 b2, RA2 e2, Out out)
>> {
>> for (; b2 != e2; ++b2)
>> if (binary_search(b1, e1, *b2))
>> *out++ = *b2;
>> }
>
> what makes you think that this is faster std::set_intersection?
> its slower!! set_intersection be faster since log2(N1) > 1.

std::set_intersection complexity is O(N1 + N2). Complexity of
log_intersection is log(N1) * N2. If N2 < N1 / (log(N1) - 1) then
log_intersection is faster. It is the case when N2 is very small,
and N1 is very large.
Show full article (0.94Kb)
no comments
Re: set find         


Author: qjohnny2000
Date: Jul 26, 2007 13:37

On Jul 25, 2:39 pm, "Roman.Perepeli...@gmail.com"
gmail.com> wrote:
>> The above looks great - except I have one question here..
>> The creation of:
>
>> Set sub(
>> big.lower_bound(*small.begin()),
>> big.upper_bound(*small.rbegin()));
>
>> This could be somewhat expensive.
>
>> If C++ had idea of subsets of a set that did not have to copy
>> to create the new set this would be perfect.
>
> You are right, this copy can be expensive. But if 'small' contains
> only a few objects, and sub contains only a few objects, then it
> can be faster than other solutions.
>
> std::set is odd thing. Range [s.begin(), s.end()) is always sorted,
> but you can't use effective std::lower_bound or std::binary_search ...
Show full article (3.56Kb)
no comments
Re: set find         


Author: Carl Barron
Date: Jul 28, 2007 01:37

Roman.Perepelitsa@gmail.com gmail.com> wrote:
>> In article <1185362774.896921.277...@o61g2000hsh.googlegroups.com>,
>>
>> <"Roman.Perepeli...@gmail.com"> wrote:
>>> // complexity is O(log(e1 - b1) * (e2 - b2))
>>> template
Show full article (3.89Kb)
no comments
Re: set find         


Author: Francis Glassborow
Date: Jul 28, 2007 10:07

Roman.Perepelitsa@gmail.com wrote:
>> In article <1185362774.896921.277...@o61g2000hsh.googlegroups.com>,
>>
>> <"Roman.Perepeli...@gmail.com"> wrote:
>>> // complexity is O(log(e1 - b1) * (e2 - b2))
>>> template
>>> void log_intersection(RA1 b1, RA1 e1,
>>> RA2 b2, RA2 e2, Out out)
>>> {
>>> for (; b2 != e2; ++b2)
>>> if (binary_search(b1, e1, *b2))
>>> *out++ = *b2;
>>> }
>> what makes you think that this is faster std::set_intersection?
>> its slower!! set_intersection be faster since log2(N1) > 1.
>
> std::set_intersection complexity is O(N1 + N2). Complexity of
> log_intersection is log(N1) * N2. If N2 < N1 / (log(N1) - 1) then
> log_intersection is faster. It is the case when N2 is very small,
> and N1 is very large. ...
Show full article (1.48Kb)
no comments