Re: Help extracting something from a string

Group: comp.unix.shell
Author: bone
Date: Nov 20, 2007 11:10

On Nov 19, 5:06 pm, Edward Rosten wrote:
> On Nov 19, 10:03 am, bone wrote:
>
>> I am having a hard time figuring this one out as the records I am
>> asked to work with "seem" rather arbitrary.
>
>> I have a stream of text and I need to extract a filename in the form
>> (bash wildcards) "*-*-*-*-*-*.pdf"
>> including the double-quotes, the characters surrounding it could be
>> anything at all.
>
> Do you have more than one per line? If so, that pattern will not work.
> Consider that the pattern is:
> "*-*.pdf"
>
> Then, the whole line will match the pattern:
>
> "a-b.pdf" junk junk junk junk junk junk "b-c.pdf"
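The over-matching described here can be checked with a small shell sketch. It assumes the pattern is used as a `case` glob, as in the loop suggested in this reply; the variable names `line` and `result` are only illustrative.

```shell
# A line holding two quoted filenames with junk in between.
line='"a-b.pdf" junk junk junk junk junk junk "b-c.pdf"'

# The glob "*-*.pdf" only requires a leading quote, a dash somewhere,
# and a trailing .pdf", so the entire line satisfies it.
case "$line" in
  \"*-*.pdf\") result="whole line matched" ;;
  *)           result="no match" ;;
esac

echo "$result"
```

Because `*` matches any run of characters, including quotes and spaces, the glob cannot tell one filename from a span that covers two.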
>
>> I won't get into how I have tried to do this so far but let's just say
>> cut isn't cutting it and I am pretty unskilled with sed apparently.
>
> If you insist on space separation, and disallow spaces in the
> filename, the following will work:
>
> while read i
> do
> case "$i" in
> \"*-*-*-*-*-*.pdf\")
> echo "$i";;
> esac
> done
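As a rough sketch of the space-separated case, the loop can test each whitespace-delimited word instead of each whole line. This assumes filenames contain no spaces; the function name `extract_words` and the sample input are hypothetical.

```shell
# Print every whitespace-separated token that looks like a quoted
# six-part .pdf filename. Reads lines from standard input.
extract_words() {
  while read -r line; do
    set -f                      # keep any * or ? in the input from expanding
    for word in $line; do       # unquoted on purpose: split on whitespace
      case "$word" in
        \"*-*-*-*-*-*.pdf\") printf '%s\n' "$word" ;;
      esac
    done
    set +f
  done
}

out=$(printf '%s\n' 'junk "a-b-c-d-e-f.pdf" more junk' | extract_words)
echo "$out"
```

This only helps when the filename is surrounded by whitespace; input where the quoted name is glued to other text needs a different approach.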

I don't control the input, and it generally won't be space-delimited.
>
> If you want to allow spaces, and you have only one per line, this sed
> script will do:
> sed -ne's/.*\(".*-.*-.*-.*-.*-.*\.pdf"\).*/\1/;tp;d;:p;p'

This doesn't seem to work:

$ echo ksdjfglsdfg"ddfd-dfdf-dfdf-dfdf-dfdf-dfdfd-dfdf.pdf"sdgsg | sed -ne's/.*\(".*-.*-.*-.*-.*-.*\.pdf"\).*/\1/;tp;d;:p;p'

It doesn't return anything.
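One likely cause, offered as a sketch rather than a confirmed diagnosis: in the test command the double quotes are not escaped, so the shell removes them before `echo` runs and sed never sees a `"` to match. The example below assumes that `s/.../\1/p` under `-n` is an acceptable stand-in for the `tp;d;:p;p` sequence (it prints only lines where the substitution succeeded); the variable names are illustrative.

```shell
# Same substitution as in the post; the trailing p flag replaces tp;d;:p;p.
script='s/.*\(".*-.*-.*-.*-.*-.*\.pdf"\).*/\1/p'

# As posted: the shell strips the unescaped double quotes, so the data
# reaching sed contains no " characters and nothing is printed.
without_quotes=$(echo ksdjfglsdfg"ddfd-dfdf-dfdf-dfdf-dfdf-dfdfd-dfdf.pdf"sdgsg | sed -n "$script")

# Single-quoting the whole argument keeps the double quotes in the data.
with_quotes=$(printf '%s\n' 'ksdjfglsdfg"ddfd-dfdf-dfdf-dfdf-dfdf-dfdfd-dfdf.pdf"sdgsg' | sed -n "$script")

printf 'as posted: [%s]\n' "$without_quotes"
printf 'quoted:    [%s]\n' "$with_quotes"
```

If the real input stream carries literal double quotes, the sed expression should behave like the second case, not the first.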