How AI-curated news feeds are engineered

Engineering AI-powered news feeds is a science, but it’s also a human-directed art. Quantumrun Foresight cannot guarantee that every article collected by the AI-curation feeds will be perfectly aligned with your research priorities. However, by providing Quantumrun Foresight with regular (weekly, bi-weekly, monthly) feedback about the quality of the curation, our research team can continue to tune the feeds to align more closely with your team’s research needs.

To gain a deeper understanding of how Quantumrun Foresight’s AI curation engine operates, we have prepared the following overview of the tools available to our curation engineers. 

Feed sources

Quantumrun indexes RSS feeds and Twitter Home Timelines. Any articles we collect must come from websites that permit other websites to access their content for RSS feed collection. Quantumrun does not publish a list of all the sources we index because this list is dynamic and changes regularly.

If you want Quantumrun to source from specific websites that do not offer RSS functionality, this will require a separate engineered solution, which is available upon request.

Platform AI tuning options

Quantumrun indexes hundreds of thousands of articles daily but only imports links (Signal posts) to articles/reports of relevance to the platform’s Business and Enterprise subscription users. Once in the platform, you can bookmark these Signals and organize your feed research into Lists, and then convert your Lists into visual foresight projects.

Still, your team may find that you want to receive more results. This can happen if a search is too narrow or too specific, or if it does not capture enough keywords. Luckily, with a few simple tricks, Quantumrun’s curation engineers can easily expand your results.

Quantumrun can:

  • Change the minimum popularity
  • Sharpen the current keywords
  • Tweak the Boolean search
  • Add more keywords, and
  • Add more sources

 

Change the minimum popularity

In Quantumrun’s Filter settings, our engineers can adjust the “minimum popularity” option. This sorts articles by how popular they are on social media. The “very high” setting for articles tends to draw from well-known, broadly read sources, while lower popularity settings may draw from more niche sources. 

Sharpen the current keywords 

Quantumrun engineers start by looking for exact phrases, so we try to think carefully about how often that exact phrase will be used within a given article of interest to a client’s particular research objectives. 

Incorporate fuzzy matching

To match similar spellings, Quantumrun engineers can make a term fuzzy by appending a tilde and a fuzzy factor. E.g., color~0.3 will match both “color” and “colour.” Here, the fuzzy factor is 0.3; the higher the fuzzy factor, the fuzzier the matches.
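
As a rough illustration of the idea (not the engine's actual algorithm), fuzzy matching can be pictured as tolerating a small number of character edits between the search term and a word in the article. The sketch below uses Levenshtein edit distance; the rule that turns a fuzzy factor into an allowed number of edits is an assumption made purely for illustration, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def fuzzy_match(term: str, word: str, fuzzy_factor: float) -> bool:
    # Assumed rule (illustrative only): allow fuzzy_factor * len(term) edits, rounded down.
    allowed_edits = int(fuzzy_factor * len(term))
    return edit_distance(term.lower(), word.lower()) <= allowed_edits


print(fuzzy_match("color", "colour", 0.3))  # one edit apart, one edit allowed
print(fuzzy_match("color", "collar", 0.3))  # two edits apart, only one allowed
```

Under this assumed rule, raising the fuzzy factor allows more edits, so more loosely related spellings start to match.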

Boosting

The Quantumrun ranking system delivers relevant content into your feed, and sometimes your team may want to prioritize some important keywords over others. That’s where boosting comes into play: basically, boosting allows your team to control the importance of a term in a search. 

To boost a term, Quantumrun engineers append the ^ symbol and a boost factor to the end of the term. For instance, if we have a search that includes the keyword “Quantum” and want to boost this keyword, then we use the query Quantum^2. To boost a phrase, we append the boost modifier after the closing quote: “quantum computing”^10.

Any terms that don’t have a field or boost specified are searched in the title and body text fields by default, and the title gets a boost of ^25. Quantumrun engineers could reproduce this default behavior with the following term: (title:water^25 body:water). This is a Boolean OR query that searches for the term “water” in the article’s title field with a boost factor of 25, and in the body field with no boost. This approach ranks articles with the term in the title higher than those that contain the term only in the body.
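
To make the effect of the title boost concrete, here is a minimal sketch of boosted scoring. It assumes a toy scoring rule (count of term occurrences per field, multiplied by the field's boost); real search engines use more elaborate relevance formulas, and the score function and sample articles are hypothetical.

```python
def score(article: dict, term: str,
          title_boost: float = 25.0, body_boost: float = 1.0) -> float:
    """Toy relevance score: boosted count of whole-word term occurrences per field."""
    term = term.lower()
    title_hits = article["title"].lower().split().count(term)
    body_hits = article["body"].lower().split().count(term)
    return title_boost * title_hits + body_boost * body_hits


# Hypothetical sample articles, for illustration only.
articles = [
    {"title": "City budget update", "body": "Water rates rise as water use grows"},
    {"title": "Water shortages ahead", "body": "Reservoirs are running low"},
]

# Rank by score, highest first: one title hit outweighs two body hits.
ranked = sorted(articles, key=lambda a: score(a, "water"), reverse=True)
for a in ranked:
    print(score(a, "water"), a["title"])
```

In this sketch, the article with “water” in its title is ranked first even though the other article mentions the term twice in its body.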

Lastly, we can’t “negative boost” a keyword. 

Tweak the Boolean search

Sometimes results are limited by how the Boolean search interprets our commands. For example, if your team wants to learn about organic coffee beans, Quantumrun engineers have found that searching for “organic” AND “coffee beans” generates more articles than searching for “organic coffee beans.” There’s no guarantee that people writing about that subject will use that exact three-word phrase.
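
The difference between the two searches can be sketched with simple substring checks. This is only an illustration of the logic, with hypothetical helper names and made-up headlines, not a description of how the engine evaluates queries.

```python
def contains_phrase(text: str, phrase: str) -> bool:
    """Case-insensitive check that the exact phrase appears in the text."""
    return phrase.lower() in text.lower()


def contains_all(text: str, phrases: list[str]) -> bool:
    """Boolean AND: every phrase must appear somewhere in the text."""
    return all(contains_phrase(text, p) for p in phrases)


# Hypothetical headlines, for illustration only.
headlines = [
    "Organic coffee beans gain shelf space",
    "Why organic farms are switching to shade-grown coffee beans",
    "Coffee beans prices climb",
]

exact = [h for h in headlines if contains_phrase(h, "organic coffee beans")]
split = [h for h in headlines if contains_all(h, ["organic", "coffee beans"])]

print(len(exact), "match the exact three-word phrase")
print(len(split), "match organic AND coffee beans")
```

The AND query also catches the second headline, where the two ideas appear in the same article but not side by side.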

Another thing Quantumrun engineers can do is change the rules that help your team filter articles based on external content.

  • “contain the exact phrase”
    • We use this option to specify exact terms or phrases found in an article. 
    • Example: The exact term “water” matches “Water down the bridge” but doesn’t match “Watermelon Sugar.”
    • Matching is not case-sensitive.
  • “contain words starting with”
    • We use this option to specify word prefixes found in an article.
    • Example: The word prefix “water” matches both “Water down the bridge” and “Watermelon Sugar.”
  • “contain text similar to phrase”
    • We use this option to specify fuzzy search terms.
    • Example: The fuzzy search term “color” matches both “colour” and “color.”
  • “be shared with Hashtag”
    • We use this option to specify the hashtag an article was shared under on Twitter. Hashtags have to match exactly, although they are not case-sensitive.
  • “be from Web Domain ending with”
    • We use this to specify the domain suffix under which an article is hosted. This rule is useful for specifying the kinds of websites from which we want to collect feed recommendations. We can, e.g., limit search to Canadian websites by entering “.ca” in this rule with a Must application. Or we can exclude articles from a specific website by entering the Web Domain in this rule with a Must Not application.
  • “match advanced query”
    • We use this to specify advanced rules for matching content. The next section on advanced query syntax will provide all the details.
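
The difference between the “contain the exact phrase” and “contain words starting with” rules above can be sketched with regular expressions. This is a minimal illustration that assumes matching happens on word boundaries; the function names are hypothetical and the platform's real matching may differ.

```python
import re


def contains_exact_phrase(text: str, phrase: str) -> bool:
    """Whole-word, case-insensitive match of the phrase."""
    return re.search(rf"\b{re.escape(phrase)}\b", text, re.IGNORECASE) is not None


def contains_words_starting_with(text: str, prefix: str) -> bool:
    """Case-insensitive match of any word that begins with the prefix."""
    return re.search(rf"\b{re.escape(prefix)}", text, re.IGNORECASE) is not None


for title in ["Water down the bridge", "Watermelon Sugar"]:
    print(title,
          contains_exact_phrase(title, "water"),
          contains_words_starting_with(title, "water"))
```

The exact rule accepts only the first title, while the prefix rule accepts both, which mirrors the examples in the list.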

Advanced Query Syntax

Here are some expert options available for pinpointing our feeds.

Wildcards

Your keywords may include a bunch of variations. For example, you may want to learn more about paint, so in addition to “paint,” Quantumrun engineers can also search for “paints,” “painter,” “painters,” “painting,” and “paintings,” which would net you more articles. You can use a wildcard search to cut down on duplication just by using the * symbol. For instance, the query paint* would look for all of the above. 
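
As a small sketch of what a trailing wildcard does, the example below matches each word in a piece of text against a pattern such as paint* using Python's standard fnmatch module. The helper name and the sample sentence are hypothetical, and the engine's own wildcard handling may differ in detail.

```python
import re
from fnmatch import fnmatchcase


def wildcard_matches(pattern: str, text: str) -> list[str]:
    """Return the words in text that match a wildcard pattern such as 'paint*'."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if fnmatchcase(w, pattern.lower())]


sample = "The painter sold two paintings and some paint before the repaint"
print(wildcard_matches("paint*", sample))
```

Note that in this sketch “repaint” is not matched, because the wildcard only stands in for characters after “paint”.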

Groupings 

Parentheses allow Quantumrun engineers to create queries with nested logic. For instance, to search for content that must contain either “information” or “technology,” Quantumrun engineers would include the following term: (information technology).
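
Nested logic of this kind can be pictured as a small tree of AND and OR nodes. The sketch below is an illustration only: the tuple-based query representation is an assumption made for this example, not the platform's query format, and words are compared after a simple whitespace split.

```python
def matches(node, text: str) -> bool:
    """Evaluate a nested query: a string is a single word, a tuple is (operator, *children)."""
    if isinstance(node, str):
        return node.lower() in text.lower().split()
    operator, *children = node
    results = [matches(child, text) for child in children]
    return all(results) if operator == "AND" else any(results)


# Hypothetical query: "policy" must appear, plus either "information" or "technology".
query = ("AND", "policy", ("OR", "information", "technology"))

print(matches(query, "New technology policy announced"))
print(matches(query, "New farm policy announced"))
```

The inner group behaves like the parenthesized term in the text: it is satisfied when either word is present.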

Field specifiers

Field specifiers allow Quantumrun engineers to query a particular field in an article. If Quantumrun engineers don’t specify a field, the term will be matched against the article’s title and body text fields.

The following fields are available for searching:

  • Body searches in the article body only. Example: To find articles that have the term “apple” in their body text, Quantumrun engineers enter body:apple as one of the query terms.
  • Domain matches the domain suffix in the article’s URL. Quantumrun engineers use this to find articles from a given Web Domain, e.g., for geographic filtering. Domains are interpreted from right to left, which may be unexpected. So to match any “.uk” domains, Quantumrun engineers just enter domain:uk.
    • Example 1: To match articles from Web Domain ending in “.com.au”, enter domain:com.au
    • Example 2: To match articles from a specific Web Domain, enter domain:quantumrun.com.
  • Excerpt searches the first 300 characters in the article’s body text only. Sometimes, searching this field instead of the entire body will eliminate noisy results since the most important terms are typically found at the beginning of an article. Example: To search for articles that contain the term “content marketing” at the beginning of the body text, enter excerpt:“content marketing”
  • Hashtag finds articles that were shared on Twitter with this hashtag. Example: To find articles that were shared on Twitter with the “#beyonce” hashtag, enter the following: hashtag:beyonce.
  • Title searches in the article title only. Example: To find articles that contain the term “green tea” in their title, Quantumrun engineers enter the search term title:“green tea”
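
Two of the fields above lend themselves to a short sketch: the right-to-left domain suffix match and the 300-character excerpt. The code below is a minimal illustration under the assumption that domains are compared label by label from the right; the function names and URLs are hypothetical.

```python
from urllib.parse import urlparse


def domain_matches(url: str, suffix: str) -> bool:
    """Compare the URL's host with the suffix, label by label, from right to left."""
    host_labels = (urlparse(url).hostname or "").lower().split(".")
    suffix_labels = suffix.lower().strip(".").split(".")
    return host_labels[-len(suffix_labels):] == suffix_labels


def excerpt_contains(body: str, phrase: str, length: int = 300) -> bool:
    """Search only the first `length` characters of the body text."""
    return phrase.lower() in body[:length].lower()


print(domain_matches("https://news.example.co.uk/story", "uk"))
print(domain_matches("https://www.quantumrun.com/insight", "quantumrun.com"))
print(excerpt_contains("Content marketing budgets are growing.", "content marketing"))
```

Comparing whole labels means a suffix such as “run.com” would not accidentally match “quantumrun.com” in this sketch.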

Add more sources

Quantumrun engineers could include more sources to expand the net that Quantumrun casts. Quantumrun engineers can create new feed collections in three ways: through RSS feeds, through your Twitter timeline, or through an OPML import. 

Quantumrun engineers could include specific sites into your feed collections via their RSS feeds, or draw from articles shared by your Twitter stream. The Twitter Home Timeline option is especially helpful because you carefully choose the thought leaders and influencers to follow. Rather than trying to skim your entire Twitter timeline, Quantumrun can surface all the articles being shared by accounts your team follows.
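
For teams preparing an OPML file for import, it may help to see what such a file contains. OPML is an XML format in which each feed is typically an outline element carrying an xmlUrl attribute. The sketch below reads those URLs with Python's standard library; the sample file contents and the feed_urls helper are made up for illustration and say nothing about how Quantumrun processes imports internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML content, for illustration only.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>Example feeds</title></head>
  <body>
    <outline text="Technology">
      <outline type="rss" text="Example A" xmlUrl="https://example.com/a/rss.xml"/>
      <outline type="rss" text="Example B" xmlUrl="https://example.org/b/feed"/>
    </outline>
  </body>
</opml>"""


def feed_urls(opml_text: str) -> list[str]:
    """Collect the xmlUrl attribute of every outline element, including nested ones."""
    root = ET.fromstring(opml_text)
    return [o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib]


for url in feed_urls(SAMPLE_OPML):
    print(url)
```

Folder-style outline elements without an xmlUrl attribute are skipped, so only actual feeds are listed.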

Languages

For now, Quantumrun limits our curation to English content only. However, if there is an interest in curating content from non-English sources, then that is doable with your collaboration. For example, we would need to translate relevant keywords and create a list of foreign websites that you would want us to collect content from. That would require a paid service to set up and cannot be included during the free trial.
