Wednesday, June 3, 2020

How does Shopify Search work?

Recently, I read a discussion about Shopify Search at https://www.reddit.com/r/ShopifySEO/comments/gwa4rj/shopify_search_engine/ . I wanted to add to that research, but comments there are limited to 10,000 words, so I am splitting my notes into a separate post for a more complete discussion.

Search engines are programs that search for data and information related to a particular keyword and return a list of documents relevant to that keyword. Strictly speaking, a search engine is a class of software, but today people usually use the term to describe websites such as Google, Bing, and Yahoo.

Web Search Engines

  • Web search engines send out spiders to gather as much data as possible. Another program, called an indexer, then reads those documents and builds an index based on the words in each text. Each search engine uses its own analytical algorithms to produce the results most relevant to a query.
  • As more and more websites rely almost entirely on search engines for their traffic, an industry has grown up rapidly in recent years around optimizing website content so that it appears as high as possible in search results.
  • See also: Web search engine - Wikipedia

Common types of Search Engines

  • Besides web search engines, there are other common types:
    • Local (offline) search engine: designed to work offline on personal computers, on CD-ROMs, or within a LAN.
    • Metasearch engine: essentially a combination of search engines; it works by querying several other search engines and then merging their answers to produce the most relevant results (a small sketch of this idea follows this list).
    • Blog search engine: a search engine specialized for blog content, crawling and indexing blogs to provide information from that field. One example is Shopify Search.
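To make the metasearch idea above a little more concrete, here is a minimal sketch in Python. The two fetch_results_* functions are hypothetical placeholders standing in for real search engine APIs; the only point is the query-then-merge pattern.

    # Minimal metasearch sketch; the fetch_results_* functions are hypothetical
    # placeholders, not real search engine APIs.
    def fetch_results_engine_a(query):
        return [("https://example.com/a1", 0.9), ("https://example.com/a2", 0.7)]

    def fetch_results_engine_b(query):
        return [("https://example.com/a1", 0.8), ("https://example.com/b1", 0.6)]

    def metasearch(query):
        """Query several engines, then merge their answers by summed score."""
        merged = {}
        for fetch in (fetch_results_engine_a, fetch_results_engine_b):
            for url, score in fetch(query):
                merged[url] = merged.get(url, 0.0) + score
        # URLs returned by several engines float to the top of the merged list.
        return sorted(merged.items(), key=lambda item: item[1], reverse=True)

    print(metasearch("t-shirt"))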

How do search engines work?

  • For good SEO, you need to understand how a search engine operates, especially Google. Know your opponent and you can win the battle.
  • Each search engine uses its own complex algorithm to synthesize and analyze results. The results for a given query are then displayed on a SERP (Search Engine Results Page). The search algorithm works with the components of a web page, including the page title, the content, and the number of keywords, and then ranks the pages to determine their order on the SERP.
  • Different search engines use different algorithms, so a page that sits at the top of Yahoo may appear somewhere else entirely on Google, and vice versa. These algorithms are extremely confidential and closely guarded, which means that succeeding with SEO comes down to testing, failing, and drawing conclusions. On top of that, the algorithms are always changing, from time to time and from event to event. SEO is a tough but fascinating profession that demands constant learning.
  • For many people, Google is synonymous with the Internet. It is the default homepage when they turn on the computer and start surfing the web, the gateway to the internet. Without search engines, content on the internet could only reach readers through traditional channels, which are slow, difficult, and no longer reliable enough.
  • Do you know how a search engine works? There are three basic stages (a toy end-to-end sketch follows this list):
    • Crawling: discovering content
    • Indexing: analyzing it and storing it in a database
    • Retrieval: retrieving the information and returning a results page to the searcher
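As a toy illustration of these three stages working together, here is a short Python sketch over a hard-coded, in-memory "web". It only shows the crawl -> index -> retrieve flow; it is not how any real engine is implemented.

    # Toy end-to-end pipeline: crawl an in-memory "web", index it, answer a query.
    web = {  # url -> (page text, outgoing links)
        "/home":  ("shopify search engines explained", ["/crawl", "/index"]),
        "/crawl": ("spiders crawl pages and follow links", ["/index"]),
        "/index": ("indexing stores words in a database", []),
    }

    def crawl(start):
        """Stage 1: follow links and collect page text."""
        seen, stack, pages = set(), [start], {}
        while stack:
            url = stack.pop()
            if url in seen:
                continue
            seen.add(url)
            text, links = web[url]
            pages[url] = text
            stack.extend(links)
        return pages

    def build_index(pages):
        """Stage 2: map every word to the set of urls containing it."""
        index = {}
        for url, text in pages.items():
            for word in text.split():
                index.setdefault(word, set()).add(url)
        return index

    def retrieve(index, query):
        """Stage 3: look the query terms up and return the matching urls."""
        return set.union(*(index.get(word, set()) for word in query.split()))

    print(retrieve(build_index(crawl("/home")), "spiders indexing"))  # {'/crawl', '/index'}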

1) Crawling

  • Crawling is the source of everything: it follows websites and captures their data. It consists of scanning web pages and retrieving all the information about a site - the page title, the images, the keywords it contains, and the links to other sites. Modern crawlers may also keep a complete copy of the page itself, along with information such as the site's layout, ad units, links, and unusual signals (hidden signatures, wrong content, harmful elements).
  • How does a page get crawled? An automated bot, called a spider (not Spider-Man), visits each web page the way you would browse a website, but silently and far faster than we can, and it chews through all the content on that page, page after page. Even in Google's early years, a single spider could crawl hundreds of pages per second.
  • The search engine then makes a list of all the links present on that page and continues to crawl them one by one, in the same way it crawled the previous page. It is a process without end, daunting but admirable (a minimal crawler sketch follows this list).
  • Any site that has backlinks on other websites will automatically be crawled and indexed (in fact it is crawled only once, but the ranking benefit is doubled), and when you update your site manually with a new post (by submitting it to Google for fetching, for example), your site will also be crawled. Sites with more backlinks are crawled more often. If a website is too large and contains too many links, the spiders may not scan it to the end, depending on the search engine's algorithm, because every layer they go deeper by following the links on a page multiplies the workload. For example, if a typical page carries 50 links and the site is 7 levels deep, the number of pages the bots would have to crawl approaches 50^7, roughly 780 billion. There are also ways to ask a search engine NOT to index a page.
  • There was a time when there were dark corners of the internet about which no information ever appeared on a SERP. They are called deep webs (the Silk Road site, the bitcoin drug marketplace that was broken up, was one), but today they are much rarer. Deep webs are reached in a different way, hosted on TOR, so they are not crawled in the usual way either.
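Below is a minimal breadth-first crawler sketch in Python, using only the standard library and a crude regular expression for link extraction. Real spiders respect robots.txt, deduplicate far more carefully, and run huge numbers of fetches in parallel; the depth and page limits here stand in for the workload explosion described above.

    # Minimal breadth-first crawler sketch (illustration only).
    import re
    import urllib.request
    from collections import deque
    from urllib.parse import urljoin

    HREF = re.compile(r'href="(http[^"]+)"')  # crude link extraction

    def crawl(seed_url, max_pages=10, max_depth=2):
        seen, queue, pages = {seed_url}, deque([(seed_url, 0)]), {}
        while queue and len(pages) < max_pages:
            url, depth = queue.popleft()
            try:
                html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "ignore")
            except Exception:
                continue  # skip pages that cannot be fetched
            pages[url] = html  # keep a copy of the page, as real crawlers do
            if depth < max_depth:
                for link in HREF.findall(html):
                    link = urljoin(url, link)
                    if link not in seen:
                        seen.add(link)
                        queue.append((link, depth + 1))
        return pages

    # Example: pages = crawl("https://example.com")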

2) Indexing

  • Indexing is the process of arranging all the information gained from crawling and putting it into a huge database. I still cannot work out how Google manages to produce results almost instantly across such a tremendous amount of information; even with powerful supercomputers, if the data were not organized scientifically, every query would take a long time to answer (a minimal inverted-index sketch follows this list).
  • Imagine you are trying to make a list of the books you own, with author names and page numbers. Going through each book is crawling; writing the list is indexing. Now imagine it is not just your bookshelf, but a school's library, or the libraries of a whole city. That is a small-scale simulation of the work Google does every day.
  • All this data is stored in data centers, on hard drives totaling thousands of petabytes.
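Continuing the bookshelf analogy, here is a minimal inverted index sketch in Python: every word points to the documents that contain it (together with how often it appears), which is why a lookup can be fast even over an enormous collection. The three documents are made-up examples.

    # Minimal inverted index sketch: word -> {document id: term frequency}.
    from collections import defaultdict

    documents = {
        1: "blue cotton t-shirt for summer",
        2: "red t-shirt with shopify logo",
        3: "summer sale on cotton shorts",
    }

    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1

    def search(query):
        """Answer a query with a cheap lookup instead of scanning every document."""
        hits = [set(index.get(word, {})) for word in query.lower().split()]
        return set.intersection(*hits) if hits else set()

    print(search("cotton summer"))  # -> {1, 3}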

3) Ranking & Retrieval

  • The final step is the one you see on screen: after you enter a search term, the search engine displays the results it believes are most relevant to what you are looking for. This step is very complicated, not simple at all, and it is also what distinguishes one search engine from another (there is plenty of evidence that some engines copy search results from Google).
  • Sorting and ranking algorithms bring out the links most relevant to the query. This process is so complicated, and so critical to a search engine's survival, that the algorithm is usually protected to the maximum extent. Why? Because there are always competitors lurking, looking for loopholes. As long as a search engine produces the most accurate results, it will be trusted and lead the market in usage. Secrecy also limits abuse by SEOs looking to exploit the algorithm, which any website would rather do than focus on quality content.
  • When a technology is no longer a secret, it is always at risk of being hacked, exploited, and used to manipulate the results.
  • Exploiting holes in ranking algorithms was probably born right after search engines appeared, but in the last 3 to 4 years Google has fought back hard. Originally, a page was ranked largely on how often the queried keyword appeared on it, which made keyword spam the simplest yet most effective SEO technique for a while. And what about the user? What about the Internet? Nothing more than a pile of rubbish.
  • That is when Google came up with the concept of links and made it an important ranking attribute. A website is ranked higher when many other sites link to it, but that in turn produced a generation of link spamming.
  • Nowadays each link carries its own value rather than all links being equal, and the value of a link depends on the authority of the website it comes from. (Links from high-quality websites such as Mozilla, the BBC, Wikipedia, or .gov sites are worth much more than links from classifieds or copy-paste blogs, and your website ranks higher accordingly; a simplified link-authority sketch follows this list.)
  • Today, search and ranking algorithms have become secrets on the scale of national secrets, perhaps as important as the secrets of building nuclear weapons. That is why the art of search engine optimization is now so focused on creating quality content: there is little else left that SEOs know how to abuse.
  • Around 60% of searchers end up clicking on the first search result, so a high ranking on the SERP is vital for a website.
  • Did you know: Quora's search bar is also a good illustrative example.
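To illustrate the link-authority idea above, here is a simplified, PageRank-style power iteration in Python over a made-up link graph. Real ranking combines a huge number of signals; this sketch covers only the "a link from an important page is worth more" part.

    # Simplified PageRank-style link authority over a toy graph (illustration only).
    links = {          # page -> pages it links out to
        "a": ["b", "c"],
        "b": ["c"],
        "c": ["a"],
        "d": ["c"],    # "c" collects the most inbound links
    }

    def rank(links, damping=0.85, iterations=50):
        pages = list(links)
        score = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new = {p: (1 - damping) / len(pages) for p in pages}
            for page, outgoing in links.items():
                share = score[page] / len(outgoing)  # a page shares its score among its links
                for target in outgoing:
                    new[target] += damping * share
            score = new
        return sorted(score.items(), key=lambda item: item[1], reverse=True)

    print(rank(links))  # "c" ends up with the highest authority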

More Information

  • With current search engines such as Shopify Search and Facebook Search, or plugins like Ultimate Product Search, AI and ML are always top concerns. Today's developers focus heavily on a search engine's ability to learn on its own, instead of writing rigid rules and requiring it to do exactly what has been programmed. One of the first aspects worth mentioning is the use of BoW (Bag of Words) and TF-IDF to process natural language. With these algorithms (which are relatively basic), a search engine can make more accurate inferences, for example 'TShir' => 'T-Shirt'. This is extremely beneficial for any business, as it minimizes the number of queries with 'NoResult', surfaces more search results, and improves the user experience. Search engines can also apply linear regression to predict customer trends and maximize revenue (a small sketch of these ideas follows).
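Here is a minimal sketch of two of those ideas, assuming a tiny hand-written product catalogue: fuzzy correction of a misspelled query (the 'TShir' => 'T-Shirt' case) using Python's standard difflib, and a hand-rolled TF-IDF score for ordering the results. Production engines use far more sophisticated models; this only shows the shape of the approach.

    # Minimal sketch: fuzzy query correction + hand-rolled TF-IDF scoring.
    import difflib
    import math
    from collections import Counter

    catalogue = [
        "t-shirt blue cotton",
        "t-shirt red shopify",
        "hoodie black cotton",
    ]
    vocabulary = {word for doc in catalogue for word in doc.split()}

    def correct(word):
        """Map a (possibly misspelled) query term to the closest catalogue word."""
        matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
        return matches[0] if matches else word.lower()

    def tfidf_score(query, doc):
        words = doc.split()
        counts = Counter(words)
        score = 0.0
        for term in query.split():
            term = correct(term)
            tf = counts[term] / len(words)
            df = sum(1 for d in catalogue if term in d.split())
            idf = math.log((1 + len(catalogue)) / (1 + df)) + 1  # smoothed IDF
            score += tf * idf
        return score

    print(correct("TShir"))  # -> 't-shirt', so the query no longer returns 'NoResult'
    print(sorted(catalogue, key=lambda d: tfidf_score("TShir", d), reverse=True))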
