I only have a rough idea of what needs to be done. If you look at the WP code that I mentioned, you’ll see a complex regular expression that WP uses to split a string into words. The same step also removes the characters that you don’t want to search by.
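To give a rough picture of the idea, here is a minimal sketch in Python. It is not WP's code or WP's regular expression: the function name `split_search_terms` and the pattern are my own assumptions. They only illustrate splitting a string into terms and discarding separator characters in one pass.

```python
import re

# Hypothetical pattern (not the one WP uses): match either a double-quoted
# phrase, or a run of characters that are not whitespace, commas, plus
# signs, or double quotes.
TERM_PATTERN = re.compile(r'"([^"]*)"|([^\s,+"]+)')


def split_search_terms(query):
    """Split a search string into lowercase terms, dropping separators."""
    terms = []
    for phrase, word in TERM_PATTERN.findall(query):
        # Exactly one of the two groups is non-empty for each match.
        term = (phrase or word).strip().lower()
        if term:  # skip empty results such as ""
            terms.append(term)
    return terms


print(split_search_terms('red "blue shoes", green+yellow'))
# → ['red', 'blue shoes', 'green', 'yellow']
```

Quoted phrases stay together as one term, and everything else is split on whitespace, commas, and plus signs. Which characters to throw away depends on what you want people to be able to search by.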
How are you doing the search/filtering now? What are you using?