Syllable counter in Ruby

There doesn't appear to be a good open source syllable counter in Ruby (or in any other language that I've seen). Hopefully this bounty will change that!

Please create a syllable counter (Syllable.rb), open source it on GitHub, and provide a link to the repo as your solution.

  • It should work like this: Syllable.count("some string") => 2

  • Your solution should rely on rules rather than dictionary lookups. Exceptions are inevitable, so it's okay to provide manual counts for certain words or parts of words, but please keep those to a minimum.

  • It should handle digits, i.e. Syllable.count("21") == Syllable.count("twenty one"). (This should come in handy.)

  • It should handle popular symbols such as % ("percent"), $ ("dollar"), @ ("at"), www. ("double-u double-u double-u dot"), and other general examples you think are useful. Make it work with just a few of these exceptions; most importantly, it should be easy for others to extend your code to support extras.

  • Include a spec or test with the tests below. All assertions should pass, and the individual wordcounts should also be correct.
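
To make the requested interface concrete, here is a minimal sketch of the shape such a counter could take. It assumes a simple vowel-group rule with a silent-e adjustment; the SYMBOLS and EXCEPTIONS tables and the count_word helper are illustrative assumptions, not a solution that passes the whole spec, and digits are not handled here.

```ruby
# Hypothetical sketch only: the rule and both tables are assumptions.
module Syllable
  # Spoken forms of symbols. Extending support means adding an entry here.
  SYMBOLS = {
    "%" => " percent ",
    "$" => " dollar ",
    "@" => " at ",
    "www." => " double-u double-u double-u dot "
  }.freeze

  # Manual counts for words the rule below gets wrong.
  EXCEPTIONS = { "hooves" => 1, "valiant" => 3 }.freeze

  # Total syllables in a string: expand symbols, then sum per-word counts.
  def self.count(text)
    spoken = SYMBOLS.reduce(text.downcase) { |s, (sym, words)| s.gsub(sym, words) }
    spoken.scan(/[a-z][a-z']*/).sum { |word| count_word(word) }
  end

  # Syllables in one lowercase word: exception table first, then vowel groups.
  def self.count_word(word)
    return EXCEPTIONS[word] if EXCEPTIONS.key?(word)

    stem = word.delete("'")
    # Drop a silent final "e", but keep it in consonant + "le" endings ("double").
    stem = stem.sub(/e\z/, "") unless stem =~ /[^aeiouy]le\z/
    [stem.scan(/[aeiouy]+/).size, 1].max
  end
end

Syllable.count("some string")  # => 2
Syllable.count("www.")         # => 10
```

Keeping the symbol and exception tables as plain hashes is one way to meet the "easy to extend" requirement, since adding a case is a one-line change.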

I'll pick the first solution that works with the provided examples (and gets all the individual wordcounts correct). Let me know if you have any questions.

Should work on the following examples:

Here are the minimum contents of the spec/test that you should include with your solution. These are presented in haiku form so they're easy to mentally verify. Please make sure the individual word counts are also correct (it's possible that a program could make the spec pass but botch the individual word counts; that would not count as a correct solution). Also, please let me know if any of the examples below are incorrect.

h = "An old silent pond...
A frog jumps into the pond,
splash! Silence again."
assert Syllable.count(h) == 17

h = "The Heartbreak Hotel
May its register stay blank
Till my next visit."
assert Syllable.count(h) == 17

h = "If a wish comes true
Who do you thank for this feat?
Yourself for wishing."
assert Syllable.count(h) == 17

h = "Springbok, gnu, gazelle
Slaking their thirst in dawns light
Bent low to water"
assert Syllable.count(h) == 17

h = "African jungle
King of Beasts prowls on his search
Hungry for fresh meat"
assert Syllable.count(h) == 17

h = "Instant death stalking
Uncertainty grips the herd
Nervous hooves shuffle"
assert Syllable.count(h) == 17

h = "Lion creeps closer
Saliva dripping from fangs
Anticipation"
assert Syllable.count(h) == 17

h = "Sentinel bird shrieks
All flee at the same instant
Stampede on the Veldt."
assert Syllable.count(h) == 17

h = "Leaves drift down like boats
on a green to and fro sea
Winter approaches."
assert Syllable.count(h) == 17

h = "The black panther waits
unseen in leafy shadows
Small faun draws nearer."
assert Syllable.count(h) == 17

h = "Morning rush for train
Sad faces never smiling
Why do they worry?"
assert Syllable.count(h) == 17

h = "Daily toil for all
Lunchtime gossip backstabbing
Then return to work"
assert Syllable.count(h) == 17

h = "Home to young faces
No time for play...must work yet
Someone has to earn"
assert Syllable.count(h) == 17

h = "Children grow quickly
Independence, ho! beckons
Babies leave the nest"
assert Syllable.count(h) == 17

h = "Lie in bed alone
Thinking of love now long dead
Was it all for naught?"
assert Syllable.count(h) == 17

h = "after summer's rain 
God's promise is remembered 
glorious rainbow"
assert Syllable.count(h) == 17

h = "People united 
To secure their liberty 
Out of many, one"
assert Syllable.count(h) == 17

h = "Ire's crest: David's Harp 
Spring words 'Aaron Forever' 
Covenants of God"
assert Syllable.count(h) == 17

h = "Fighting for freedom, 
Fall of valiant soldier 
Resting in the Lord"
assert Syllable.count(h) == 17

h = "Patrick of England 
Preserving Erin's blood line 
Drove out the serpents"
assert Syllable.count(h) == 17

h = "'My sheep hear My voice' 
Christ did say, 'and I know them 
and they follow Me'"
assert Syllable.count(h) == 17

h = "Do NOT partake of 
Knowledge of good and evil 
Satan's fruit, broad lies"
assert Syllable.count(h) == 17

h = "Haiku will wake you 
Dullsville in brain sharpen quick 
Haiku no fake who."
assert Syllable.count(h) == 17

h = "Butterfly in class 
learns lessons along with kids. 
Excellent student."
assert Syllable.count(h) == 17

h = "First autumn morning:
the mirror I stare into
shows my father's face."
assert Syllable.count(h) == 17

h = "A giant firefly:
that way, this way, that way, this -
and it passes by."
assert Syllable.count(h) == 17

h = "Rainbow in the sky
Inspiration of nature
Relief from the storm"
assert Syllable.count(h) == 17

h = "Echoing mountains
Endless possibilities
God's gift to the earth"
assert Syllable.count(h) == 17

h = "Lie in bed alone
Thinking of love now long dead
Was it all for naught?"
assert Syllable.count(h) == 17

h = "Spheres of captured light
Tales of life's contradictions
Spectrum of the earth"
assert Syllable.count(h) == 17

h = "Trees reach to the sky
As if grasping for God's hands
Earth, heaven unite"
assert Syllable.count(h) == 17

h = "Wind blows, seagulls sing
The waves crash upon the shore
Earth, sea unite. BATCH!"
assert Syllable.count(h) == 17

h = "Spheres of captured light
Tales of life's contradictions
Spectrum of the earth"
assert Syllable.count(h) == 17

h = "Glistening waters
Heaven's glorious showers
Replenish the earth"
assert Syllable.count(h) == 17

h = "So peaceful and still
Tranquility takes over
Peace rules the night's air"
assert Syllable.count(h) == 17

h = "Innocent raccoons
Such gentle eyes, fragile lives
Sit still, watch them play"
assert Syllable.count(h) == 17

h = "Homeless dog crying
Why are humans oft cruel
Innocent victim"
assert Syllable.count(h) == 17

h = "Spoken words of love
They are my heart's warm sunshine
Brighten up my day"
assert Syllable.count(h) == 17

h = "Be my friend today
I can't promise you the stars
We will share my heart"
assert Syllable.count(h) == 17

h = "Two species, one place
Life's journeys so different
No prejudice here"
assert Syllable.count(h) == 17

h = "Exotic island
Wild dreams of passion and fun
Escape from the world"
assert Syllable.count(h) == 17

h = "Desolate and dry
Yearning for water, for life
God will speak and heal"
assert Syllable.count(h) == 17

h = "Leaves of orange flame
Autumn's magical embrace
Nature's spice of life"
assert Syllable.count(h) == 17

h = "Vine weeps of sadness
Maroon leaves sigh tears of pain
Nightmare camouflaged"
assert Syllable.count(h) == 17

h = "Winding trail through time
Blurred images of my past
Life's moments relived"
assert Syllable.count(h) == 17

h = "Compassion is good
Compassionate is better
Life's moments relived"
assert Syllable.count(h) == 17

h = "$2 soda
Blurred images of my past
Life's moments relived"
assert Syllable.count(h) == 17

h = "Sikkim, India, 
on December 21, 
2010."
assert Syllable.count(h) == 17

h = "Then no matter what 
others' attitudes are, you 
can keep inner peace."
assert Syllable.count(h) == 17

h = "Didn't won't aren't bad 
mustn't butt into convos
1! 2? 3^ 4* 5()"
assert Syllable.count(h) == 17

h = "Sacred gaffe homeless
Else year friend buy taste york piece
no diabetes"
assert Syllable.count(h) == 17
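
Since a total of 17 can hide two errors that cancel out, a per-word and per-line breakdown makes the individual counts easier to check. The sketch below is only an illustration: word_breakdown and haiku_shape are hypothetical helper names, and naive is a stand-in counter (bare vowel groups), not a proposed solution.

```ruby
# Hypothetical helpers; `counter` is any callable that takes a word
# and returns its syllable count.
def word_breakdown(text, counter)
  text.scan(/[A-Za-z][A-Za-z']*/).map { |word| [word, counter.call(word)] }
end

# Syllables per line, so a haiku should come out as [5, 7, 5].
def haiku_shape(text, counter)
  text.lines.map { |line| word_breakdown(line, counter).sum { |_word, n| n } }
end

# Stand-in counter (an assumption): one syllable per group of vowels.
naive = ->(word) { [word.downcase.scan(/[aeiouy]+/).size, 1].max }

h = "An old silent pond...
A frog jumps into the pond,
splash! Silence again."

haiku_shape(h, naive)                    # => [5, 7, 6]
word_breakdown("Silence again.", naive)  # => [["Silence", 3], ["again", 2]]
```

Here the breakdown shows that the naive rule over-counts "Silence", which a single total would not pinpoint.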
@bevan There is no truly perfect syllable-counting algorithm because the English language has many peculiarities, as seen here, though I'll see what I can do.
alex 9 years ago
@bevan Whoa! Sorry, didn't read your post thoroughly.
alex 9 years ago
@alex You're right, a perfect solution would be hard or impossible without using a dictionary approach. Just making it work with the tests above is fine in this case.
bevan 9 years ago
awarded to Wikimedia
Tags
ruby
syllable


3 Solutions


Overview

There is no truly perfect syllable-counting algorithm because the English language has many peculiarities, as seen here. I cannot give you a perfect algorithm, but I am working on one that tries to account for those peculiarities. It currently passes tests 1-3, 5-33, 37, 45-47, and 50.

Note, this is a work in progress.

Edit(s)

None so far...


This works in most cases, but it would be difficult to guarantee 100% accuracy.

require 'rubygems'
require 'syllables'
require 'i18n'
require 'numbers_and_words'

text = '23 57 The Heartbreak Hotel May its register stay blank Till my next visit.'

# True for whole-number tokens such as "23" or "-7". Decimals are not
# matched, so they are passed through without being spelled out.
def is_a_number?(s)
  s.to_s.match?(/\A[+-]?\d+\z/)
end

# Replace each numeric token with its English words (numbers_and_words gem).
words = text.split(' ').map do |word|
  is_a_number?(word) ? I18n.with_locale(:en) { word.to_i.to_words } : word
end

# Syllables#to_h yields each word with its syllable count; add the counts up.
count = 0
Syllables.new(words.join(' ')).to_h.each { |_word, val| count += val }

puts count
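
If pulling in a gem just for number words is undesirable, the spelling-out step can be sketched without dependencies. This is an illustrative alternative and not part of the solution above; small_number_to_words and spell_out_numbers are hypothetical names, and only standalone numbers from 0 to 99 are covered.

```ruby
ONES = %w[zero one two three four five six seven eight nine ten eleven twelve
          thirteen fourteen fifteen sixteen seventeen eighteen nineteen].freeze
# Indexes 0 and 1 are placeholders; numbers below 20 come from ONES.
TENS = %w[_ _ twenty thirty forty fifty sixty seventy eighty ninety].freeze

# Spell out an integer between 0 and 99, e.g. 21 becomes "twenty one".
def small_number_to_words(n)
  raise ArgumentError, "only 0..99 supported" unless (0..99).cover?(n)
  return ONES[n] if n < 20

  tens, ones = n.divmod(10)
  ones.zero? ? TENS[tens] : "#{TENS[tens]} #{ONES[ones]}"
end

# Replace standalone one- or two-digit numbers in a string with words.
# Longer numbers such as "2010" are left untouched by this sketch.
def spell_out_numbers(text)
  text.gsub(/\b\d{1,2}\b/) { |digits| small_number_to_words(digits.to_i) }
end

spell_out_numbers("on December 21")  # => "on December twenty one"
```

Years and larger numbers would need their own rules (for example, reading "2010" as "twenty ten"), which is left open here.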

This is an interesting and very complex problem, and as alex mentions, there is no good solution.

Here's something quick based on Lingua:

http://github.com/sj26/syllable

It passes all but 3 of your examples. Verify this by cloning locally, installing the humanize gem, and running ruby lib/syllable.rb --spec.

Update:

One failing example is fixed: for "aren't", the CMU dictionary's default pronunciation is "AA1 R AH0 N T". They include the alternate pronunciation "AA1 R N T", which I have edited to be the preferred one in the dictionary.

Update 2:

Same again with "firefly": the dictionary had "F AY1 ER0 F L AY2", which I've modified to "F AY1 R F L AY2". Down to a single failing example.

Update 3:

The final failure is because Syllable.count("valiant") => 2, which is correct in some regions and incorrect in others. You could manually set this entry if you wished, but I'm leaving it as "failing" because I think the dictionary has it right for common parlance.
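
For readers wondering why those pronunciation edits change the count: in the CMU dictionary's notation, vowel phonemes carry a stress digit (0, 1, or 2), so the syllable count of an entry is the number of phonemes ending in a digit. A minimal sketch of that idea follows; syllables_in_pronunciation is a hypothetical helper name, not a function from the repo.

```ruby
# Count syllables in a CMU-style pronunciation string: one syllable per
# vowel phoneme, and vowel phonemes are the ones ending in a stress digit.
def syllables_in_pronunciation(pronunciation)
  pronunciation.split.count { |phoneme| phoneme.match?(/\d\z/) }
end

syllables_in_pronunciation("AA1 R AH0 N T")      # => 2 (default entry)
syllables_in_pronunciation("AA1 R N T")          # => 1 (alternate entry)
syllables_in_pronunciation("F AY1 ER0 F L AY2")  # => 3 (original entry)
syllables_in_pronunciation("F AY1 R F L AY2")    # => 2 (modified entry)
```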

Thanks for the cool solution. DBM's a great find! The reason I asked for a non-dictionary solution is because speed is more of a concern than accuracy (looking for a replacement counter for haiku.li). Haven't had time to test yet but will verify soon, have cloned it. Cheers!
bevan 9 years ago
Ah, I evidently skimmed over the "no dictionary" bit. The syllable counter included uses dictionary lookup and, failing that, guessing. The rules for guessing are pretty good. Supplemented with some exceptions like you suggest, it should be adequate. Or perhaps a lightweight dictionary is acceptable as a supplement.
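
The layering described here (manual exceptions, then an optional lightweight dictionary, then rule-based guessing) could be sketched roughly as follows. All names are hypothetical and this is not the API of the linked repo; the guessing rule is a bare vowel-group stand-in.

```ruby
# Hypothetical sketch of a layered lookup; not the sj26/syllable API.
class LayeredCounter
  def initialize(overrides, dictionary, guesser)
    @overrides = overrides    # small hand-maintained Hash of word => count
    @dictionary = dictionary  # optional Hash of word => count, may be empty
    @guesser = guesser        # callable implementing the rule-based guess
  end

  # Try each layer in order and fall through to the guesser last.
  def count_word(word)
    key = word.downcase
    @overrides.fetch(key) { @dictionary.fetch(key) { @guesser.call(key) } }
  end
end

# Stand-in guessing rule (an assumption): one syllable per vowel group.
vowel_groups = ->(word) { [word.scan(/[aeiouy]+/).size, 1].max }

counter = LayeredCounter.new({ "valiant" => 3 }, { "firefly" => 2 }, vowel_groups)
counter.count_word("Valiant")  # => 3, from the overrides
counter.count_word("firefly")  # => 2, from the dictionary
counter.count_word("pond")     # => 1, from the guesser
```

With an empty dictionary hash this reduces to rules plus exceptions, which keeps lookups cheap when speed matters more than accuracy.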
sj26 9 years ago