Use sed or something to change uppercase to wrapped lowercase

I'd like to use sed or something similar to read in a text file and change all instances of uppercase phrases to lowercase wrapped with \textsc{ **** }.

So, for example,

THIS SENTENCE IS ALL CAPS except not really

should become

\textsc{this sentence is all caps} except not really

and

But this sentence has ONE capped word

should become

But this sentence has \textsc{one} capped word

whereas

This Sentence Has Many Caps

should remain

This Sentence Has Many Caps
awarded to Hoan Dang


3 Solutions


Ruby:

string = "THIS SENTENCE IS ALL CAPS except not really
BUT this sentence has ONE capped word
This Sentence Has Many CAPS"

# Wrap each run of capitals (and whitespace) in \textsc{...}, lowercased
string = string.gsub(/([A-Z][A-Z\s]+)/) { |s|
  '\textsc{' + s.downcase.strip + '} '
}

puts string

Output:

\textsc{this sentence is all caps} except not really
\textsc{but} this sentence has \textsc{one} capped word
This Sentence Has Many \textsc{caps} 
This doesn't read in any file.
suchow over 6 years ago
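
To address the comment above, here is a minimal sketch of how the Ruby approach might read a text file and write the converted result to another file. The method names `wrap_caps` and `wrap_caps_file`, the file names, and the stricter pattern (whole words of two or more capitals, so a lone "I" or "A" is left alone) are assumptions for illustration, not part of the original answer.

```ruby
# Wrap every run of all-caps words in \textsc{...} and lowercase it.
# Assumption: a "capped word" is a whole word of two or more capitals,
# and words in a run are separated by spaces or tabs.
def wrap_caps(text)
  caps_run = /\b[A-Z]{2,}(?:[ \t]+[A-Z]{2,})*\b/
  text.gsub(caps_run) { |run| "\\textsc{#{run.downcase}}" }
end

# Hypothetical helper: convert src_path line by line into dst_path.
def wrap_caps_file(src_path, dst_path)
  File.open(dst_path, "w") do |out|
    File.foreach(src_path) { |line| out.write(wrap_caps(line)) }
  end
end

# Example usage (file names are placeholders):
# wrap_caps_file("old-file.txt", "new-file.txt")
```

Processing one line at a time keeps line endings intact, because the pattern never matches across a newline.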

There's probably a more elegant way of doing it with awk, but this sed solution (GNU sed, since it relies on the \L extension) works for your test cases:

$ echo "THIS SENTENCE IS ALL CAPS except not really" | sed -r 's/(([A-Z.,()]+ ?)+[A-Z.,()]+)/\\textsc{\L\1}/'
\textsc{this sentence is all caps} except not really
$ echo "But this sentence has ONE capped word" | sed -r 's/(([A-Z.,()]+ ?)+[A-Z.,()]+)/\\textsc{\L\1}/'
But this sentence has \textsc{one} capped word
$ echo "This Sentence Has Many Caps" | sed -r 's/(([A-Z.,()]+ ?)+[A-Z.,()]+)/\\textsc{\L\1}/'
This Sentence Has Many Caps

And standalone:

sed -r 's/(([A-Z.,()]+ ?)+[A-Z.,()]+)/\\textsc{\L\1}/'
Not quite. This fails when given "Hello. This Sentence Has Many Caps", erroneously returning "Hello\textsc{L. T}his Sentence Has Many Caps".
suchow over 6 years ago
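
The failure reported above seems to come from the punctuation in the character class: `[A-Z.,()]` lets ". T" count as an uppercase run. The Ruby sketch below reproduces the problem with the same expression and contrasts it with a stricter pattern that requires whole words of two or more capitals. The names `loose_wrap` and `strict_wrap` and the stricter pattern are illustrative assumptions, not the answerer's fix.

```ruby
# Port of the sed expression above. Like sed without the g flag,
# sub replaces only the first match on the line.
def loose_wrap(line)
  line.sub(/(([A-Z.,()]+ ?)+[A-Z.,()]+)/) { "\\textsc{#{$1.downcase}}" }
end

# Assumed stricter pattern: whole words of 2+ capitals, space-separated,
# with no punctuation allowed inside a run.
def strict_wrap(line)
  line.gsub(/\b[A-Z]{2,}(?: [A-Z]{2,})*\b/) { |run| "\\textsc{#{run.downcase}}" }
end

tricky = "Hello. This Sentence Has Many Caps"
puts loose_wrap(tricky)   # the ". T" span gets wrapped
puts strict_wrap(tricky)  # the line is left unchanged
puts strict_wrap("But this sentence has ONE capped word")
```

With the loose pattern the first line prints `Hello\textsc{. t}his Sentence Has Many Caps`. The literal `L` in the output quoted in the comment presumably means that sed did not implement GNU's `\L` case conversion, which is why the text differs slightly from this port. The trade-off with the strict pattern is that punctuation inside a capitalized phrase ends the run.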
Winning solution

Since the problem becomes quite tricky with sed, I used awk instead.

Here is the terminal session:

~> cat old-file.txt
   THIS SENTENCE IS ALL CAPS except not really
   But this sentence has ONE capped word
   This Sentence Has Many Caps

~> cat tst.awk
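   {
       # Find the next run of all-caps words (two or more letters each).
       # Note: the {2,} interval needs an awk with POSIX interval support.
       while ( match( $0, /([[:upper:]]{2,}[[:space:]]*)+/) ) {
            rstart  = RSTART
            rlength = RLENGTH
            # Exclude trailing whitespace from the span to be wrapped.
            if ( match( substr($0,RSTART,RLENGTH), /[[:space:]]+$/) ) {
                rlength = rlength - RLENGTH
            }
            # Splice the lowercased, wrapped span back into the line.
            $0 = substr($0,1,rstart-1) \
                 "\\textsc{" tolower(substr($0,rstart,rlength)) "}" \
                 substr($0,rstart+rlength)
       }

       print
   }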

~> awk -f tst.awk old-file.txt  > new-file.txt
~> cat new-file.txt
   \textsc{this sentence is all caps} except not really
   But this sentence has \textsc{one} capped word
   This Sentence Has Many Caps