
Computer virus


Hex dump of the Blaster worm, showing a message left for Microsoft co-founder Bill Gates by the worm's programmer

A computer virus is a type of computer program that, when executed, replicates itself by modifying other computer programs and inserting its own code. If this replication succeeds, the affected areas are then said to be "infected" with a computer virus.

Computer viruses generally require a host program. The virus writes its own code into the host program, and when that program runs, the virus code is executed first, causing infection and damage. A computer worm does not need a host program, as it is an independent program or chunk of code. It is therefore not restricted by a host program and can run independently and actively carry out attacks.

Computer viruses cause billions of dollars' worth of economic damage each year.

In 1989 the ADAPSO Software Industry Division published Dealing With Electronic Vandalism, in which the risk of data loss is followed by "the added risk of losing customer confidence."

In response, free, open-source anti-virus tools have been developed, and an industry of antivirus software has cropped up, selling or freely distributing virus protection to users of various operating systems.

Overview

Virus writers use social engineering deceptions and exploit detailed knowledge of security vulnerabilities to initially infect systems and to spread the virus. The vast majority of viruses target systems running Microsoft Windows, employing a variety of mechanisms to infect new hosts, and often using complex anti-detection/stealth strategies to evade antivirus software. Motives for creating viruses can include seeking profit (e.g., with ransomware), desire to send a political message, personal amusement, to demonstrate that a vulnerability exists in software, for sabotage and denial of service, or simply because they wish to explore cybersecurity issues, artificial life and evolutionary algorithms.

Damage is due to causing system failure, corrupting data, wasting computer resources, increasing maintenance costs or stealing personal information. Even though no antivirus software can uncover all computer viruses (especially new ones), computer security researchers are actively searching for new ways to enable antivirus solutions to more effectively detect emerging viruses, before they become widely distributed.

Other malware

The term "virus" is also misused by extension to refer to other types of malware. "Malware" encompasses computer viruses along with many other forms of malicious software, such as computer "worms", ransomware, spyware, adware, trojan horses, keyloggers, rootkits, bootkits, malicious Browser Helper Object (BHOs), and other malicious software. The majority of active malware threats are trojan horse programs or computer worms rather than computer viruses. The term computer virus, coined by Fred Cohen in 1985, is a misnomer.[24] Viruses often perform some type of harmful activity on infected host computers, such as acquisition of hard disk space or central processing unit (CPU) time, accessing and stealing private information (e.g., credit card numbers, debit card numbers, phone numbers, names, email addresses, passwords, bank information, house addresses, etc.), corrupting data, displaying political, humorous or threatening messages on the user's screen, spamming their e-mail contacts, logging their keystrokes, or even rendering the computer useless. However, not all viruses carry a destructive "payload" and attempt to hide themselves—the defining characteristic of viruses is that they are self-replicating computer programs that modify other software without user consent by injecting themselves into the said programs, similar to a biological virus which replicates within living cells.

Historical development

Early academic work on self-replicating programs

The first academic work on the theory of self-replicating computer programs[25] was done in 1949 by John von Neumann, who gave lectures at the University of Illinois about the "Theory and Organization of Complicated Automata". The work of von Neumann was later published as the "Theory of self-reproducing automata". In his essay, von Neumann described how a computer program could be designed to reproduce itself.[26] Von Neumann's design for a self-reproducing computer program is considered the world's first computer virus, and he is considered to be the theoretical "father" of computer virology.[27] In 1972, Veith Risak, directly building on von Neumann's work on self-replication, published his article "Selbstreproduzierende Automaten mit minimaler Informationsübertragung" (Self-reproducing automata with minimal information exchange).[28] The article describes a fully functional virus written in assembler programming language for a SIEMENS 4004/35 computer system. In 1980 Jürgen Kraus wrote his Diplom thesis "Selbstreproduktion bei Programmen" (Self-reproduction of programs) at the University of Dortmund.[29] In his work Kraus postulated that computer programs can behave in a way similar to biological viruses.
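
The simplest concrete illustration of a self-reproducing program is a "quine": a program whose only output is its own source code. The following two-line example (in Python, a language chosen here purely for illustration) prints an exact copy of itself when run:

    s = 's = %r\nprint(s %% s)'
    print(s % s)

The string s is a template for the whole program; formatting s with itself reproduces both lines, the same copy-the-description trick that von Neumann's self-reproducing automata formalize.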

Science fiction

The first known description of a self-reproducing program in fiction is in the 1970 short story The Scarred Man by Gregory Benford which describes a computer program called VIRUS which, when installed on a computer with telephone modem dialing capability, randomly dials phone numbers until it hits a modem that is answered by another computer, and then attempts to program the answering computer with its own program, so that the second computer will also begin dialing random numbers, in search of yet another computer to program. The program rapidly spreads exponentially through susceptible computers and can only be countered by a second program called VACCINE.[30]

The idea was explored further in two 1972 novels, When HARLIE Was One by David Gerrold and The Terminal Man by Michael Crichton, and became a major theme of the 1975 novel The Shockwave Rider by John Brunner.[31]

The 1973 Michael Crichton sci-fi movie Westworld made an early mention of the concept of a computer virus as a central plot theme that causes androids to run amok.[32] Alan Oppenheimer's character summarizes the problem by stating that "...there's a clear pattern here which suggests an analogy to an infectious disease process, spreading from one...area to the next." Other characters reply: "Perhaps there are superficial similarities to disease" and "I must confess I find it difficult to believe in a disease of machinery."[33]

First examples

The MacMag virus 'Universal Peace', as displayed on a Mac in March 1988

The Creeper virus was first detected on ARPANET, the forerunner of the Internet, in the early 1970s.[34] Creeper was an experimental self-replicating program written by Bob Thomas at BBN Technologies in 1971.[35] Creeper used the ARPANET to infect DEC PDP-10 computers running the TENEX operating system.[36] Creeper gained access via the ARPANET and copied itself to the remote system where the message, "I'm the creeper, catch me if you can!" was displayed. The Reaper program was created to delete Creeper.[37]

In 1982, a program called "Elk Cloner" was the first personal computer virus to appear "in the wild"—that is, outside the single computer or computer lab where it was created.[38] Written in 1981 by Richard Skrenta, a ninth grader at Mount Lebanon High School near Pittsburgh, it attached itself to the Apple DOS 3.3 operating system and spread via floppy disk.[38] On its 50th use the Elk Cloner virus would be activated, infecting the personal computer and displaying a short poem beginning "Elk Cloner: The program with a personality."

In 1984 Fred Cohen from the University of Southern California wrote his paper "Computer Viruses – Theory and Experiments".[39] It was the first paper to explicitly call a self-reproducing program a "virus", a term introduced by Cohen's mentor Leonard Adleman. In 1987, Fred Cohen published a demonstration that there is no algorithm that can perfectly detect all possible viruses.[40] Fred Cohen's theoretical compression virus[41] was an example of a virus which was not malicious software (malware), but was putatively benevolent (well-intentioned). However, antivirus professionals do not accept the concept of "benevolent viruses", as any desired function can be implemented without involving a virus (automatic compression, for instance, is available under Windows at the choice of the user). Any virus will by definition make unauthorised changes to a computer, which is undesirable even if no damage is done or intended. The first page of Dr Solomon's Virus Encyclopaedia explains the undesirability of viruses, even those that do nothing but reproduce.[42][5]

An article that describes "useful virus functionalities" was published by J. B. Gunn under the title "Use of virus functions to provide a virtual APL interpreter under user control" in 1984.[43] The first IBM PC virus in the "wild" was a boot sector virus dubbed (c)Brain,[44] created in 1986 by Amjad Farooq Alvi and Basit Farooq Alvi in Lahore, Pakistan, reportedly to deter unauthorized copying of the software they had written.[45] The first virus to specifically target Microsoft Windows, WinVir, was discovered in April 1992, two years after the release of Windows 3.0.[46] The virus did not contain any Windows API calls, instead relying on DOS interrupts. A few years later, in February 1996, Australian hackers from the virus-writing crew VLAD created the Bizatch virus (also known as the "Boza" virus), which was the first known virus to target Windows 95. In late 1997 the encrypted, memory-resident stealth virus Win32.Cabanas was released—the first known virus that targeted Windows NT (it was also able to infect Windows 3.0 and Windows 9x hosts).[47]

Even home computers were affected by viruses. The first one to appear on the Commodore Amiga was a boot sector virus called SCA virus, which was detected in November 1987.[48]

Operations and functions

Parts

A viable computer virus must contain a search routine, which locates new files or new disks that are worthwhile targets for infection, and a copy routine, which copies the virus into the program that the search routine locates.[49] The three main virus parts are:

  • Infection mechanism (also called 'infection vector'): This is how the virus spreads or propagates. A virus typically has a search routine, which locates new files or new disks for infection.[50]
  • Trigger: Also known as a logic bomb, this is the part of the virus that determines the event or condition for the malicious "payload" to be activated or delivered,[51] such as a particular date or time, the presence of another program, the capacity of the disk exceeding some limit,[52] or a double-click that opens a particular file.[53]
  • Payload: The "payload" is the actual body or data that carries out the malicious purpose of the virus. Payload activity might be noticeable (e.g., because it causes the system to slow down or "freeze"), as most of the time the "payload" itself is the harmful activity;[50] sometimes it is non-destructive but distributive, which is called a virus hoax.[54] These three parts are illustrated in the sketch after this list.
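
As a deliberately harmless illustration of these three parts, the following Python sketch simulates them entirely in memory: the "host programs" are plain dictionaries, the trigger is a date check, and the payload only prints a message. All names are invented for illustration; nothing here touches real files.

    import datetime

    # Simulated host programs; no file system access occurs anywhere below.
    host_programs = [{"name": "calc", "infected": False},
                     {"name": "notes", "infected": False}]

    def search_routine(programs):
        # Infection mechanism: locate targets that do not yet carry the marker.
        return [p for p in programs if not p["infected"]]

    def trigger():
        # Trigger (logic bomb): activate the payload only on a chosen date.
        return datetime.date.today().day == 13

    def payload():
        # Payload: here, merely a message on screen.
        print("Payload activated (simulation only).")

    # Copy routine: mark each located target, then check the trigger.
    for target in search_routine(host_programs):
        target["infected"] = True
    if trigger():
        payload()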

Phases

Virus phases describe the life cycle of the computer virus, using an analogy to biology. This life cycle can be divided into four phases:

  • Dormant phase: The virus program is idle during this stage. The virus program has managed to access the target user's computer or software, but during this stage, the virus does not take any action. The virus will eventually be activated by the "trigger" which states which event will execute the virus. Not all viruses have this stage.[50]
  • Propagation phase: The virus starts propagating, which is multiplying and replicating itself. The virus places a copy of itself into other programs or into certain system areas on the disk. The copy may not be identical to the propagating version; viruses often "morph" or change to evade detection by IT professionals and anti-virus software. Each infected program will now contain a clone of the virus, which will itself enter a propagation phase.[50]
  • Triggering phase: A dormant virus moves into this phase when it is activated, and will now perform the function for which it was intended. The triggering phase can be caused by a variety of system events, including a count of the number of times that this copy of the virus has made copies of itself.[50] The trigger may occur when an employee is terminated from their employment or after a set period of time has elapsed, in order to reduce suspicion.
  • Execution phase: This is the actual work of the virus, where the "payload" is released. It can be destructive, such as deleting files on disk, crashing the system, or corrupting files, or relatively harmless, such as popping up humorous or political messages on screen.[50] The sketch after this list summarizes the four phases schematically.
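
Under our own simplifying assumptions, the four phases can be read as a linear state machine. The short Python sketch below does nothing more than enumerate the transitions just described:

    from enum import Enum, auto

    class Phase(Enum):
        DORMANT = auto()       # idle until the trigger condition is met
        PROPAGATION = auto()   # copies itself into other programs or disk areas
        TRIGGERING = auto()    # the activating event or condition occurs
        EXECUTION = auto()     # the payload is released

    TRANSITIONS = {
        Phase.DORMANT: Phase.PROPAGATION,
        Phase.PROPAGATION: Phase.TRIGGERING,
        Phase.TRIGGERING: Phase.EXECUTION,
    }

    phase = Phase.DORMANT
    while phase in TRANSITIONS:
        phase = TRANSITIONS[phase]
    print(phase)  # Phase.EXECUTION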

Infection targets and replication techniques

Computer viruses infect a variety of different subsystems on their host computers and software.[55] One manner of classifying viruses is to analyze whether they reside in binary executables (such as .EXE or .COM files), data files (such as Microsoft Word documents or PDF files), or in the boot sector of the host's hard drive (or some combination of all of these).[56][57]

Resident vs. non-resident viruses

A memory-resident virus (or simply "resident virus") installs itself as part of the operating system when executed, after which it remains in RAM from the time the computer is booted up to when it is shut down. Resident viruses overwrite interrupt handling code or other functions, and when the operating system attempts to access the target file or disk sector, the virus code intercepts the request and redirects the control flow to the replication module, infecting the target. In contrast, a non-memory-resident virus (or "non-resident virus"), when executed, scans the disk for targets, infects them, and then exits (i.e. it does not remain in memory after it is done executing).[58][59][60]

Macro viruses

Many common applications, such as Microsoft Outlook and Microsoft Word, allow macro programs to be embedded in documents or emails, so that the programs may be run automatically when the document is opened. A macro virus (or "document virus") is a virus that is written in a macro language and embedded into these documents so that when users open the file, the virus code is executed, and can infect the user's computer. This is one of the reasons that it is dangerous to open unexpected or suspicious attachments in e-mails.[61][62] While not opening attachments in e-mails from unknown persons or organizations can help to reduce the likelihood of contracting a virus, in some cases, the virus is designed so that the e-mail appears to be from a reputable organization (e.g., a major bank or credit card company).

Boot sector viruses

Boot sector viruses specifically target the boot sector and/or the Master Boot Record[63] (MBR) of the host's hard disk drive, solid-state drive, or removable storage media (flash drives, floppy disks, etc.).[56][64][65]

Boot sector viruses are most commonly transmitted through physical media. An infected floppy disk or USB flash drive connected to the computer transfers the virus when the drive's volume boot record (VBR) is read, and the virus then modifies or replaces the existing boot code. The next time the user starts the computer, the virus immediately loads and runs as part of the master boot record.[66]

Email virus

Email viruses are viruses that intentionally, rather than accidentally, use the email system to spread. While virus-infected files may be accidentally sent as email attachments, email viruses are aware of email system functions. They generally target a specific type of email system (Microsoft Outlook is the most commonly used), harvest email addresses from various sources, and may append copies of themselves to all email sent, or may generate email messages containing copies of themselves as attachments.[67]

Stealth techniques

To avoid detection by users, some viruses employ different kinds of deception. Some old viruses, especially on the DOS platform, make sure that the "last modified" date of a host file stays the same when the file is infected by the virus. This approach does not fool antivirus software, however, especially software that maintains and dates cyclic redundancy checks (CRCs) to detect file changes.[68] Some viruses can infect files without increasing their sizes or damaging the files. They accomplish this by overwriting unused areas of executable files. These are called cavity viruses. For example, the CIH virus, or Chernobyl virus, infects Portable Executable files. Because those files contain many empty gaps, the virus, which was 1 KB in length, did not add to the size of the file.[69] Some viruses try to avoid detection by killing the tasks associated with antivirus software before it can detect them (for example, Conficker). In the 2010s, as computers and operating systems grew larger and more complex, old hiding techniques needed to be updated or replaced. Defending a computer against viruses may demand that a file system migrate towards detailed and explicit permission for every kind of file access.[citation needed]

Read request intercepts

While some kinds of antivirus software employ various techniques to counter stealth mechanisms, once the infection occurs any recourse to "clean" the system is unreliable. In Microsoft Windows operating systems, the NTFS file system is proprietary. This leaves antivirus software little alternative but to send a "read" request to the Windows files that handle such requests. Some viruses trick antivirus software by intercepting its requests to the operating system. A virus can hide by intercepting the request to read the infected file, handling the request itself, and returning an uninfected version of the file to the antivirus software. The interception can occur by code injection into the actual operating system files that would handle the read request. Thus, antivirus software attempting to detect the virus will either not be permitted to read the infected file, or the "read" request will be served with the uninfected version of the same file.[70]

The only reliable method to avoid "stealth" viruses is to boot from a medium that is known to be "clean". Security software can then be used to check the dormant operating system files. Most security software relies on virus signatures, or it employs heuristics.[71][72] Security software may also use a database of file "hashes" for Windows OS files, so that it can identify altered files and request Windows installation media to replace them with authentic versions. In older versions of Windows, the cryptographic hashes of Windows OS files stored in Windows—used to allow file integrity/authenticity to be checked—could be overwritten, so that the System File Checker would report altered system files as authentic; using file hashes to scan for altered files would therefore not always guarantee finding an infection.[73]
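
A minimal sketch of the file-hash idea, assuming a baseline recorded while the system was known to be clean; the paths and the baseline file name are illustrative, not any particular product's format:

    import hashlib
    import json
    import os

    def sha256_of(path):
        # Hash a file in chunks so large files do not exhaust memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    def build_baseline(root, out="baseline.json"):
        # Record known-good hashes, ideally while booted from clean media.
        baseline = {}
        for dirpath, _, files in os.walk(root):
            for name in files:
                p = os.path.join(dirpath, name)
                baseline[p] = sha256_of(p)
        with open(out, "w") as f:
            json.dump(baseline, f)

    def check_against_baseline(baseline_file="baseline.json"):
        # Flag any file whose current hash differs from the recorded one.
        with open(baseline_file) as f:
            baseline = json.load(f)
        for path, digest in baseline.items():
            if not os.path.exists(path) or sha256_of(path) != digest:
                print("ALTERED OR MISSING:", path)

As the paragraph above notes, such a check is only trustworthy when run from clean boot media, since a resident virus can otherwise intercept the reads.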

Self-modification

Most modern antivirus programs try to find virus patterns inside ordinary programs by scanning them for so-called virus signatures.[74] Unfortunately, the term is misleading, in that viruses do not possess unique signatures in the way that human beings do. Such a virus "signature" is merely a sequence of bytes that an antivirus program looks for because it is known to be part of the virus. A better term would be "search strings". Different antivirus programs will employ different search strings, and indeed different search methods, when identifying viruses. If a virus scanner finds such a pattern in a file, it will perform other checks to make sure that it has found the virus, and not merely a coincidental sequence in an innocent file, before it notifies the user that the file is infected. The user can then delete, or (in some cases) "clean" or "heal" the infected file. Some viruses employ techniques that make detection by means of signatures difficult but probably not impossible. These viruses modify their code on each infection. That is, each infected file contains a different variant of the virus.[citation needed]
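
A toy scanner along these lines might look as follows. The byte "signatures" are invented placeholders, except for the harmless, industry-standard EICAR test string; real signature databases are far larger and matched with much more efficient algorithms:

    SIGNATURES = {
        b"\xde\xad\xbe\xef\x13\x37": "Example.TestSig.A",   # made-up pattern
        b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE": "EICAR test file",
    }

    def scan_file(path):
        # Return the names of any known search strings found in the file.
        with open(path, "rb") as f:
            data = f.read()
        return [name for sig, name in SIGNATURES.items() if sig in data]

A match would then be confirmed by further checks, as described above, before the user is notified.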

Encrypted viruses

One method of evading signature detection is to use simple encryption to encipher (encode) the body of the virus, leaving only the encryption module and a static cryptographic key in cleartext, which does not change from one infection to the next.[75] In this case, the virus consists of a small decrypting module and an encrypted copy of the virus code. If the virus is encrypted with a different key for each infected file, the only part of the virus that remains constant is the decrypting module, which would (for example) be appended to the end. In this case, a virus scanner cannot directly detect the virus using signatures, but it can still detect the decrypting module, which still makes indirect detection of the virus possible. Since these would be symmetric keys, stored on the infected host, it is entirely possible to decrypt the final virus, but this is probably not required, since self-modifying code is such a rarity that finding some may be reason enough for virus scanners to at least "flag" the file as suspicious.[citation needed] An old but compact method is the use of an arithmetic operation like addition or subtraction, or a logical operation such as XOR,[76] in which each byte of the virus is combined with a constant, so that the operation only has to be repeated for decryption. Because it is suspicious for code to modify itself, the code that performs the encryption/decryption may itself be part of the signature in many virus definitions.[citation needed] A simpler, older approach did not use a key at all: the encryption consisted only of operations with no parameters, like incrementing and decrementing, bitwise rotation, arithmetic negation, and logical NOT.[76] Some viruses, called polymorphic viruses, employ a means of encryption inside an executable in which the virus is encrypted under certain events, such as the virus scanner being disabled for updates or the computer being rebooted.[77] This is called cryptovirology.
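
The constant-key XOR scheme is easy to demonstrate, because XOR with a fixed byte is its own inverse: the identical routine both encodes and decodes. A small Python illustration, with an invented key and data:

    def xor_bytes(data: bytes, key: int) -> bytes:
        # XOR every byte with the same constant key.
        return bytes(b ^ key for b in data)

    plain = b"example body"
    encoded = xor_bytes(plain, 0x5A)
    # Repeating the identical operation restores the original.
    assert xor_bytes(encoded, 0x5A) == plain

This symmetry is precisely why the decoding loop itself remains constant across infections and can serve as a signature.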

Polymorphic code

Polymorphic code was the first technique that posed a serious threat to virus scanners. Just like regular encrypted viruses, a polymorphic virus infects files with an encrypted copy of itself, which is decoded by a decryption module. In the case of polymorphic viruses, however, this decryption module is also modified on each infection. A well-written polymorphic virus therefore has no parts which remain identical between infections, making it very difficult to detect directly using "signatures".[78][79] Antivirus software can detect it by decrypting the viruses using an emulator, or by statistical pattern analysis of the encrypted virus body. To enable polymorphic code, the virus has to have a polymorphic engine (also called "mutating engine" or "mutation engine") somewhere in its encrypted body. See polymorphic code for technical detail on how such engines operate.[80]

Some viruses employ polymorphic code in a way that constrains the mutation rate of the virus significantly. For example, a virus can be programmed to mutate only slightly over time, or it can be programmed to refrain from mutating when it infects a file on a computer that already contains copies of the virus. The advantage of using such slow polymorphic code is that it makes it more difficult for antivirus professionals and investigators to obtain representative samples of the virus, because "bait" files that are infected in one run will typically contain identical or similar samples of the virus. This will make it more likely that the detection by the virus scanner will be unreliable, and that some instances of the virus may be able to avoid detection.

Metamorphic code

To avoid being detected by emulation, some viruses rewrite themselves completely each time they are to infect new executables. Viruses that utilize this technique are said to use metamorphic code. To enable metamorphism, a "metamorphic engine" is needed. A metamorphic virus is usually very large and complex. For example, W32/Simile consisted of over 14,000 lines of assembly language code, 90% of which is part of the metamorphic engine.[81][82]

Vulnerabilities and infection vectors

Software bugs

As software is often designed with security features to prevent unauthorized use of system resources, many viruses must exploit and manipulate security bugs, which are security defects in a system or application software, to spread themselves and infect other computers. Software development strategies that produce large numbers of "bugs" will generally also produce potentially exploitable "holes" or "entrances" for the virus.

Social engineering and poor security practices

To replicate itself, a virus must be permitted to execute code and write to memory. For this reason, many viruses attach themselves to executable files that may be part of legitimate programs (see code injection). If a user attempts to launch an infected program, the virus' code may be executed simultaneously.[83] In operating systems that use file extensions to determine program associations (such as Microsoft Windows), the extensions may be hidden from the user by default. This makes it possible to create a file that is of a different type than it appears to the user. For example, an executable may be created and named "picture.png.exe", in which the user sees only "picture.png" and therefore assumes that this file is a digital image and most likely is safe, yet when opened, it runs the executable on the client machine.[84] Viruses may be installed on removable media, such as flash drives. The drives may be left in a parking lot of a government building or other target, with the hopes that curious users will insert the drive into a computer. In a 2015 experiment, researchers at the University of Michigan found that 45–98 percent of users would plug in a flash drive of unknown origin.[85]

Vulnerability of different operating systems

The vast majority of viruses target systems running Microsoft Windows. This is due to Microsoft's large market share of desktop computer users.[86] The diversity of software systems on a network limits the destructive potential of viruses and malware.[87] Open-source operating systems such as Linux allow users to choose from a variety of desktop environments, packaging tools, etc., which means that malicious code targeting any of these systems will only affect a subset of all users. Many Windows users are running the same set of applications, enabling viruses to rapidly spread among Microsoft Windows systems by targeting the same exploits on large numbers of hosts.[14][15][16][88]

While Linux and Unix in general have always natively prevented normal users from making changes to the operating system environment without permission, Windows users are generally not prevented from making these changes, meaning that viruses can easily gain control of the entire system on Windows hosts. This difference has continued partly due to the widespread use of administrator accounts in contemporary versions like Windows XP. In 1997, researchers created and released a virus for Linux—known as "Bliss".[89] Bliss, however, requires that the user run it explicitly, and it can only infect programs that the user has access to modify. Unlike Windows users, most Unix users do not log in as an administrator, or "root user", except to install or configure software; as a result, even if a user ran the virus, it could not harm their operating system. The Bliss virus never became widespread, and remains chiefly a research curiosity. Its creator later posted the source code to Usenet, allowing researchers to see how it worked.[90]

Countermeasures

Antivirus software

Screenshot of the open source ClamWin antivirus software running in Wine on Ubuntu Linux

Many users install antivirus software that can detect and eliminate known viruses when the computer attempts to download or run the executable file (which may be distributed as an email attachment, or on USB flash drives, for example). Some antivirus software blocks known malicious websites that attempt to install malware. Antivirus software does not change the underlying capability of hosts to transmit viruses. Users must update their software regularly to patch security vulnerabilities ("holes"). Antivirus software also needs to be regularly updated to recognize the latest threats. This is because malicious hackers and other individuals are always creating new viruses. The German AV-TEST Institute publishes evaluations of antivirus software for Windows[91] and Android.[92]

Examples of Microsoft Windows antivirus and anti-malware software include the optional Microsoft Security Essentials[93] (for Windows XP, Vista and Windows 7) for real-time protection, the Windows Malicious Software Removal Tool[94] (now included with Windows (Security) Updates on "Patch Tuesday", the second Tuesday of each month), and Windows Defender (an optional download in the case of Windows XP).[95] Additionally, several capable antivirus software programs are available for free download from the Internet (usually restricted to non-commercial use).[96] Some such free programs are almost as good as commercial competitors.[97] Common security vulnerabilities are assigned CVE IDs and listed in the US National Vulnerability Database. Secunia PSI[98] is an example of software, free for personal use, that will check a PC for vulnerable out-of-date software, and attempt to update it. Ransomware and phishing scam alerts appear as press releases on the Internet Crime Complaint Center noticeboard. Ransomware is a virus that posts a message on the user's screen saying that the screen or system will remain locked or unusable until a ransom payment is made. Phishing is a deception in which the malicious individual pretends to be a friend, computer security expert, or other benevolent individual, with the goal of convincing the targeted individual to reveal passwords or other personal information.

Other commonly used preventive measures include timely operating system updates, software updates, careful Internet browsing (avoiding shady websites), and installation of only trusted software.[99] Certain browsers flag sites that have been reported to Google and that have been confirmed as hosting malware by Google.[100][101]

There are two common methods that an antivirus software application uses to detect viruses, as described in the antivirus software article. The first, and by far the most common method of virus detection is using a list of virus signature definitions. This works by examining the content of the computer's memory (its Random Access Memory (RAM), and boot sectors) and the files stored on fixed or removable drives (hard drives, floppy drives, or USB flash drives), and comparing those files against a database of known virus "signatures". Virus signatures are just strings of code that are used to identify individual viruses; for each virus, the antivirus designer tries to choose a unique signature string that will not be found in a legitimate program. Different antivirus programs use different "signatures" to identify viruses. The disadvantage of this detection method is that users are only protected from viruses that are detected by signatures in their most recent virus definition update, and not protected from new viruses (see "zero-day attack").[102]

A second method to find viruses is to use a heuristic algorithm based on common virus behaviors. This method can detect new viruses for which antivirus security firms have yet to define a "signature", but it also gives rise to more false positives than using signatures. False positives can be disruptive, especially in a commercial environment, because they may lead to a company instructing staff not to use the company computer system until IT services have checked the system for viruses. This can slow down productivity for regular workers.
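
A crude sketch of the heuristic idea, with invented trait strings and weights, scores a file by the suspicious features it contains and flags it above a threshold. It also shows why false positives arise: legitimate tools, such as debuggers, contain the same API names:

    # Invented weights; real heuristics use many behavioral and structural traits.
    SUSPICIOUS_TRAITS = {
        b"CreateRemoteThread": 3,    # API often used for code injection
        b"WriteProcessMemory": 3,    # API often used to modify other processes
        b"cmd.exe /c": 1,            # shell invocation embedded in a binary
    }
    THRESHOLD = 4

    def heuristic_score(data: bytes) -> int:
        # Sum the weights of every suspicious trait present in the data.
        return sum(w for s, w in SUSPICIOUS_TRAITS.items() if s in data)

    def looks_suspicious(data: bytes) -> bool:
        return heuristic_score(data) >= THRESHOLD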

Recovery strategies and methods

One may reduce the damage done by viruses by making regular backups of data (and the operating systems) on different media that are either kept unconnected to the system (most of the time, as with a hard drive), read-only, or not accessible for other reasons, such as using different file systems. This way, if data is lost through a virus, one can start again using the backup (which will hopefully be recent).[103] If a backup session on optical media like CD and DVD is closed, it becomes read-only and can no longer be affected by a virus (so long as a virus or infected file was not copied onto the CD/DVD). Likewise, an operating system on a bootable CD can be used to start the computer if the installed operating systems become unusable. Backups on removable media must be carefully inspected before restoration. The Gammima virus, for example, propagates via removable flash drives.[104][105]

Virus removal

Many websites run by antivirus software companies provide free online virus scanning, with limited "cleaning" facilities (after all, the purpose of the websites is to sell antivirus products and services). Some websites—like Google subsidiary VirusTotal.com—allow users to upload one or more suspicious files to be scanned and checked by one or more antivirus programs in one operation.[106][107] Microsoft offers an optional free antivirus utility called Microsoft Security Essentials, a Windows Malicious Software Removal Tool that is updated as part of the regular Windows update regime, and an older optional anti-malware (malware removal) tool, Windows Defender, that has been upgraded to an antivirus product in Windows 8.

Some viruses disable System Restore and other important Windows tools such as Task Manager and CMD. An example of a virus that does this is CiaDoor. Many such viruses can be removed by rebooting the computer, entering Windows "safe mode" with networking, and then using system tools or Microsoft Safety Scanner.[109] System Restore on Windows Me, Windows XP, Windows Vista and Windows 7 can restore the registry and critical system files to a previous checkpoint. Often a virus will cause a system to "hang" or "freeze", and a subsequent hard reboot will render a system restore point from the same day corrupted. Restore points from previous days should work, provided the virus is not designed to corrupt the restore files and does not exist in previous restore points.[110][111]

Operating system reinstallation

Microsoft's System File Checker (improved in Windows 7 and later) can be used to check for, and repair, corrupted system files.[112] Restoring an earlier "clean" (virus-free) copy of the entire partition from a cloned disk, a disk image, or a backup copy is one solution—restoring an earlier backup disk "image" is relatively simple to do, usually removes any malware, and may be faster than "disinfecting" the computer—or reinstalling and reconfiguring the operating system and programs from scratch, as described below, then restoring user preferences.[103] Reinstalling the operating system is another approach to virus removal. It may be possible to recover copies of essential user data by booting from a live CD, or connecting the hard drive to another computer and booting from the second computer's operating system, taking great care not to infect that computer by executing any infected programs on the original drive. The original hard drive can then be reformatted and the OS and all programs installed from original media. Once the system has been restored, precautions must be taken to avoid reinfection from any restored executable files.[113]

Viruses and the Internet

Before computer networks became widespread, most viruses spread on removable media, particularly floppy disks. In the early days of the personal computer, many users regularly exchanged information and programs on floppies. Some viruses spread by infecting programs stored on these disks, while others installed themselves into the disk boot sector, ensuring that they would be run when the user booted the computer from the disk, usually inadvertently. Personal computers of the era would attempt to boot first from a floppy if one had been left in the drive. Until floppy disks fell out of use, this was the most successful infection strategy and boot sector viruses were the most common in the "wild" for many years. Traditional computer viruses emerged in the 1980s, driven by the spread of personal computers and the resultant increase in bulletin board system (BBS), modem use, and software sharing. Bulletin board–driven software sharing contributed directly to the spread of Trojan horse programs, and viruses were written to infect popularly traded software. Shareware and bootleg software were equally common vectors for viruses on BBSs.[114][115] Viruses can increase their chances of spreading to other computers by infecting files on a network file system or a file system that is accessed by other computers.[116]

Macro viruses have become common since the mid-1990s. Most of these viruses are written in the scripting languages for Microsoft programs such as Microsoft Word and Microsoft Excel and spread throughout Microsoft Office by infecting documents and spreadsheets. Since Word and Excel were also available for Mac OS, most could also spread to Macintosh computers. Although most of these viruses did not have the ability to send infected email messages, those that did took advantage of the Microsoft Outlook Component Object Model (COM) interface.[117][118] Some old versions of Microsoft Word allow macros to replicate themselves with additional blank lines. If two macro viruses simultaneously infect a document, the combination of the two, if also self-replicating, can appear as a "mating" of the two and would likely be detected as a virus unique from the "parents".[119]

A virus may also send a web address link as an instant message to all the contacts (e.g., friends and colleagues' e-mail addresses) stored on an infected machine. If the recipient, thinking the link is from a friend (a trusted source), follows the link to the website, the virus hosted at the site may be able to infect this new computer and continue propagating.[120] Viruses that spread using cross-site scripting were first reported in 2002,[121] and were academically demonstrated in 2005.[122] There have been multiple instances of cross-site scripting viruses in the "wild", exploiting websites such as MySpace (with the Samy worm) and Yahoo!.

 

Self-replicating machine

A simple form of machine self-replication

A self-replicating machine is a type of autonomous robot that is capable of reproducing itself autonomously using raw materials found in the environment, thus exhibiting self-replication in a way analogous to that found in nature. The concept of self-replicating machines has been advanced and examined by Homer Jacobson, Edward F. Moore, Freeman Dyson, John von Neumann and in more recent times by K. Eric Drexler in his book on nanotechnology, Engines of Creation (coining the term clanking replicator for such machines) and by Robert Freitas and Ralph Merkle in their review Kinematic Self-Replicating Machines which provided the first comprehensive analysis of the entire replicator design space. The future development of such technology is an integral part of several plans involving the mining of moons and asteroid belts for ore and other materials, the creation of lunar factories, and even the construction of solar power satellites in space. The von Neumann probe is one theoretical example of such a machine. Von Neumann also worked on what he called the universal constructor, a self-replicating machine that would be able to evolve and which he formalized in a cellular automata environment. Notably, Von Neumann's Self-Reproducing Automata scheme posited that open-ended evolution requires inherited information to be copied and passed to offspring separately from the self-replicating machine, an insight that preceded the discovery of the structure of the DNA molecule by Watson and Crick and how it is separately translated and replicated in the cell.

A self-replicating machine is an artificial self-replicating system that relies on conventional large-scale technology and automation. Although suggested more than 70 years ago, no self-replicating machine has been built to date. Certain idiosyncratic terms are occasionally found in the literature. For example, the term clanking replicator was once used by Drexler to distinguish macroscale replicating systems from the microscopic nanorobots or "assemblers" that nanotechnology may make possible, but the term is informal and is rarely used by others in popular or technical discussions. Replicators have also been called "von Neumann machines" after John von Neumann, who first rigorously studied the idea. However, the term "von Neumann machine" is less specific and also refers to a completely unrelated computer architecture that von Neumann proposed, so its use is discouraged where accuracy is important. Von Neumann himself used the term universal constructor to describe such self-replicating machines.

Historians of machine tools, even before the numerical control era, sometimes figuratively said that machine tools were a unique class of machines because they have the ability to "reproduce themselves" by copying all of their parts. Implicit in these discussions is that a human would direct the cutting processes (later planning and programming the machines), and would then assemble the parts. The same is true for RepRaps, which are another class of machines sometimes mentioned in reference to such non-autonomous "self-replication". In contrast, machines that are truly autonomously self-replicating (like biological machines) are the main subject discussed here.

History

The general concept of artificial machines capable of producing copies of themselves dates back at least several hundred years. An early reference is an anecdote regarding the philosopher René Descartes, who suggested to Queen Christina of Sweden that the human body could be regarded as a machine; she responded by pointing to a clock and ordering "see to it that it reproduces offspring." Several other variations on this anecdotal response also exist. Samuel Butler proposed in his 1872 novel Erewhon that machines were already capable of reproducing themselves but it was man who made them do so, and added that "machines which reproduce machinery do not reproduce machines after their own kind". In George Eliot's 1879 book Impressions of Theophrastus Such, a series of essays that she wrote in the character of a fictional scholar named Theophrastus, the essay "Shadows of the Coming Race" speculated about self-replicating machines, with Theophrastus asking "how do I know that they may not be ultimately made to carry, or may not in themselves evolve, conditions of self-supply, self-repair, and reproduction".

In 1802 William Paley formulated the first known teleological argument depicting machines producing other machines, suggesting that the question of who originally made a watch was rendered moot if it were demonstrated that the watch was able to manufacture a copy of itself. Scientific study of self-reproducing machines was anticipated by John Bernal as early as 1929 and by mathematicians such as Stephen Kleene who began developing recursion theory in the 1930s. Much of this latter work was motivated by interest in information processing and algorithms rather than physical implementation of such a system, however. In the course of the 1950s, suggestions of several increasingly simple mechanical systems capable of self-reproduction were made — notably by Lionel Penrose.

Von Neumann's kinematic model

A detailed conceptual proposal for a self-replicating machine was first put forward by mathematician John von Neumann in lectures delivered in 1948 and 1949, when he proposed a kinematic model of self-reproducing automata as a thought experiment. Von Neumann's concept of a physical self-replicating machine was dealt with only abstractly, with the hypothetical machine using a "sea" or stockroom of spare parts as its source of raw materials. The machine had a program stored on a memory tape that directed it to retrieve parts from this "sea" using a manipulator, assemble them into a duplicate of itself, and then copy the contents of its memory tape into the empty duplicate's. The machine was envisioned as consisting of as few as eight different types of components: four logic elements that send and receive stimuli and four mechanical elements used to provide a structural skeleton and mobility. Although the model was qualitatively sound, von Neumann was evidently dissatisfied with it because of the difficulty of analyzing it with mathematical rigor. He went on instead to develop an even more abstract model of a self-replicator based on cellular automata. His original kinematic concept remained obscure until it was popularized in a 1955 issue of Scientific American.

Von Neumann's goal for his self-reproducing automata theory, as specified in his lectures at the University of Illinois in 1949, was to design a machine whose complexity could grow automatically, akin to biological organisms under natural selection. He asked what threshold of complexity must be crossed for machines to be able to evolve. His answer was to design an abstract machine which, when run, would replicate itself. Notably, his design implies that open-ended evolution requires inherited information to be copied and passed to offspring separately from the self-replicating machine, an insight that preceded the discovery of the structure of the DNA molecule by Watson and Crick and how it is separately translated and replicated in the cell.

Moore's artificial living plants

In 1956 mathematician Edward F. Moore proposed the first known suggestion for a practical real-world self-replicating machine, also published in Scientific American. Moore's "artificial living plants" were proposed as machines able to use air, water and soil as sources of raw materials and to draw their energy from sunlight via a solar battery or a steam engine. He chose the seashore as an initial habitat for such machines, giving them easy access to the chemicals in seawater, and suggested that later generations of the machine could be designed to float freely on the ocean's surface as self-replicating factory barges or to be placed in barren desert terrain that was otherwise useless for industrial purposes. The self-replicators would be "harvested" for their component parts, to be used by humanity in other non-replicating machines.

Dyson's replicating systems

The next major development of the concept of self-replicating machines was a series of thought experiments proposed by physicist Freeman Dyson in his 1970 Vanuxem Lecture. He proposed three large-scale applications of machine replicators. First was to send a self-replicating system to Saturn's moon Enceladus, which in addition to producing copies of itself would also be programmed to manufacture and launch solar sail-propelled cargo spacecraft. These spacecraft would carry blocks of Enceladean ice to Mars, where they would be used to terraform the planet. His second proposal was a solar-powered factory system designed for a terrestrial desert environment, and his third was an "industrial development kit" based on this replicator that could be sold to developing countries to provide them with as much industrial capacity as desired. When Dyson revised and reprinted his lecture in 1979 he added proposals for a modified version of Moore's seagoing artificial living plants that was designed to distill and store fresh water for human use and the "Astrochicken."

Advanced Automation for Space Missions

An artist's conception of a "self-growing" robotic lunar factory

In 1980, inspired by a 1979 "New Directions Workshop" held at Woods Hole, NASA conducted a joint summer study with ASEE entitled Advanced Automation for Space Missions to produce a detailed proposal for self-replicating factories to develop lunar resources without requiring additional launches or human workers on-site. The study was conducted at Santa Clara University and ran from June 23 to August 29, with the final report published in 1982. The proposed system would have been capable of exponentially increasing productive capacity, and the design could be modified to build self-replicating probes to explore the galaxy.

The reference design included small computer-controlled electric carts running on rails inside the factory, mobile "paving machines" that used large parabolic mirrors to focus sunlight on lunar regolith to melt and sinter it into a hard surface suitable for building on, and robotic front-end loaders for strip mining. Raw lunar regolith would be refined by a variety of techniques, primarily hydrofluoric acid leaching. Large transports with a variety of manipulator arms and tools were proposed as the constructors that would put together new factories from parts and assemblies produced by their parent.

Power would be provided by a "canopy" of solar cells supported on pillars. The other machinery would be placed under the canopy.

A "casting robot" would use sculpting tools and templates to make plaster molds. Plaster was selected because the molds are easy to make, can make precise parts with good surface finishes, and the plaster can be easily recycled afterward using an oven to bake the water back out. The robot would then cast most of the parts either from nonconductive molten rock (basalt) or purified metals. A carbon dioxide laser cutting and welding system was also included.

A more speculative, more complex microchip fabricator was specified to produce the computer and electronic systems, but the designers also said that it might prove practical to ship the chips from Earth as if they were "vitamins."

A 2004 study supported by NASA's Institute for Advanced Concepts took this idea further. Some experts are beginning to consider self-replicating machines for asteroid mining.

Much of the design study was concerned with a simple, flexible chemical system for processing the ores, and the differences between the ratio of elements needed by the replicator, and the ratios available in lunar regolith. The element that most limited the growth rate was chlorine, needed to process regolith for aluminium. Chlorine is very rare in lunar regolith.

Lackner-Wendt Auxon replicators

In 1995, inspired by Dyson's 1970 suggestion of seeding uninhabited deserts on Earth with self-replicating machines for industrial development, Klaus Lackner and Christopher Wendt developed a more detailed outline for such a system. They proposed a colony of cooperating mobile robots 10–30 cm in size running on a grid of electrified ceramic tracks around stationary manufacturing equipment and fields of solar cells. Their proposal did not include a complete analysis of the system's material requirements, but it described a novel method for extracting the ten most common chemical elements found in raw desert topsoil (Na, Fe, Mg, Si, Ca, Ti, Al, C, O, and H) using a high-temperature carbothermic process. The proposal was popularized in Discover magazine, featuring solar-powered desalination equipment used to irrigate the desert in which the system was based. They named their machines "Auxons", from the Greek word auxein, meaning "to grow."

Recent work

NIAC studies on self-replicating systems

In the spirit of the 1980 "Advanced Automation for Space Missions" study, the NASA Institute for Advanced Concepts began several studies of self-replicating system design in 2002 and 2003, awarding four phase I grants.

Bootstrapping Self-Replicating Factories in Space

In 2012, NASA researchers Metzger, Muscatello, Mueller, and Mantovani argued for a so-called "bootstrapping approach" to start self-replicating factories in space. They developed this concept on the basis of In Situ Resource Utilization (ISRU) technologies that NASA has been developing to "live off the land" on the Moon or Mars. Their modeling showed that in just 20 to 40 years this industry could become self-sufficient and then grow to large size, enabling greater exploration in space as well as providing benefits back to Earth. In 2014, Thomas Kalil of the White House Office of Science and Technology Policy published on the White House blog an interview with Metzger on bootstrapping solar system civilization through self-replicating space industry. Kalil requested the public submit ideas for how "the Administration, the private sector, philanthropists, the research community, and storytellers can further these goals." Kalil connected this concept to what former NASA chief technologist Mason Peck has dubbed "Massless Exploration", the ability to make everything in space so that you do not need to launch it from Earth. Peck has said, "...all the mass we need to explore the solar system is already in space. It's just in the wrong shape." In 2016, Metzger argued that fully self-replicating industry can be started over several decades by astronauts at a lunar outpost for a total cost (outpost plus starting the industry) of about a third of the space budgets of the International Space Station partner nations, and that this industry would solve Earth's energy and environmental problems in addition to providing massless exploration.

New York University artificial DNA tile motifs

In 2011, a team of scientists at New York University created a structure called 'BTX' (bent triple helix) based around three double helix molecules, each made from a short strand of DNA. Treating each group of three double-helices as a code letter, they can (in principle) build up self-replicating structures that encode large quantities of information.

Self-replication of magnetic polymers

In 2001 Jarle Breivik at the University of Oslo created a system of magnetic building blocks which, in response to temperature fluctuations, spontaneously form self-replicating polymers.

Self-replication of neural circuits

In 1968 Zellig Harris wrote that "the metalanguage is in the language," suggesting that self-replication is part of language. In 1977 Niklaus Wirth formalized this proposition by publishing a self-replicating deterministic context-free grammar. Adding probabilities to it, Bertrand du Castel published in 2015 a self-replicating stochastic grammar and presented a mapping of that grammar to neural networks, thereby presenting a model for a self-replicating neural circuit.

Self-replicating spacecraft

The idea of an automated spacecraft capable of constructing copies of itself was first proposed in scientific literature in 1974 by Michael A. Arbib, but the concept had appeared earlier in science fiction, such as the 1967 novel Berserker by Fred Saberhagen or the 1950 novelette trilogy The Voyage of the Space Beagle by A. E. van Vogt. The first quantitative engineering analysis of a self-replicating spacecraft was published in 1980 by Robert Freitas, in which the non-replicating Project Daedalus design was modified to include all subsystems necessary for self-replication. The design's strategy was to use the probe to deliver a "seed" factory with a mass of about 443 tons to a distant site, have the seed factory replicate many copies of itself there to increase its total manufacturing capacity, and then use the resulting automated industrial complex to construct more probes with a single seed factory on board each.

Other references

  • A number of patents have been granted for self-replicating machine concepts: U.S. Patent 5,659,477, "Self reproducing fundamental fabricating machines (F-Units)", inventor: Charles M. Collins (Burke, Va.), August 1997; U.S. Patent 5,764,518, "Self reproducing fundamental fabricating machine system", inventor: Charles M. Collins (Burke, Va.), June 1998; and Collins' PCT patent WO 96/20453, "Method and system for self-replicating manufacturing stations", inventors: Ralph C. Merkle (Sunnyvale, Calif.), Eric G. Parker (Wylie, Tex.), and George D. Skidmore (Plano, Tex.), January 2003.
  • Macroscopic replicators are mentioned briefly in the fourth chapter of K. Eric Drexler's 1986 book Engines of Creation.
  • In 1995, Nick Szabo proposed a challenge to build a macroscale replicator from Lego robot kits and similar basic parts. Szabo wrote that this approach was easier than previous proposals for macroscale replicators, but successfully predicted that even this method would not lead to a macroscale replicator within ten years.
  • In 2004, Robert Freitas and Ralph Merkle published the first comprehensive review of the field of self-replication (from which much of the material in this article is derived, with permission of the authors), in their book Kinematic Self-Replicating Machines, which includes 3000+ literature references. This book included a new molecular assembler design, a primer on the mathematics of replication, and the first comprehensive analysis of the entire replicator design space.

Prospects for implementation

As the use of industrial automation has expanded over time, some factories have begun to approach a semblance of self-sufficiency that is suggestive of self-replicating machines. However, such factories are unlikely to achieve "full closure" until the cost and flexibility of automated machinery come close to those of human labour and the local manufacture of spare parts and other components becomes more economical than transporting them from elsewhere. As Samuel Butler pointed out in Erewhon, replication of partially closed universal machine tool factories is already possible. Since safety is a primary goal of all legislative consideration of the regulation of such development, future development efforts may be limited to systems which lack either control, matter, or energy closure. Fully capable machine replicators are most useful for developing resources in dangerous environments which are not easily reached by existing transportation systems (such as outer space).

An artificial replicator can be considered to be a form of artificial life. Depending on its design, it might be subject to evolution over an extended period of time. However, with robust error correction, and the possibility of external intervention, the common science fiction scenario of robotic life run amok will remain extremely unlikely for the foreseeable future.

Evolutionary algorithm

In computational intelligence (CI), an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the quality of the solutions (see also loss function). Evolution of the population then takes place through the repeated application of the above operators.

Evolutionary algorithms often perform well approximating solutions to all types of problems because they ideally do not make any assumptions about the underlying fitness landscape. Techniques from evolutionary algorithms applied to the modeling of biological evolution are generally limited to explorations of microevolutionary processes and planning models based upon cellular processes. In most real applications of EAs, computational complexity is a prohibiting factor; this complexity is mostly due to fitness function evaluation, and fitness approximation is one way to overcome the difficulty. However, a seemingly simple EA can often solve quite complex problems; therefore, there may be no direct link between algorithm complexity and problem complexity.

Implementation

The following is an example of a generic single-objective genetic algorithm; a minimal Python sketch follows the steps.

Step One: Generate the initial population of individuals randomly. (First generation)

Step Two: Repeat the following regenerational steps until termination (e.g., a time limit is reached or sufficient fitness is achieved):

  1. Evaluate the fitness of each individual in the population.
  2. Select the fittest individuals for reproduction. (Parents)
  3. Breed new individuals through crossover and mutation operations to give birth to offspring.
  4. Replace the least-fit individuals of the population with new individuals.
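
A minimal Python sketch of these steps might look as follows; the bitstring representation, the toy "count the ones" fitness function, and all parameter values are assumptions chosen for illustration:

```python
import random

POP_SIZE, GENOME_LEN, GENERATIONS, MUT_RATE = 30, 20, 100, 0.02

def fitness(genome):
    return sum(genome)  # toy objective: maximize the number of 1-bits

def crossover(p1, p2):
    cut = random.randrange(1, GENOME_LEN)  # single-point crossover
    return p1[:cut] + p2[cut:]

def mutate(genome):
    return [bit ^ 1 if random.random() < MUT_RATE else bit for bit in genome]

# Step One: generate the initial population randomly.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # 1. Evaluate the fitness of each individual.
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == GENOME_LEN:
        break  # termination: sufficient fitness achieved
    # 2. Select the fittest individuals as parents (top half).
    parents = population[: POP_SIZE // 2]
    # 3. Breed offspring through crossover and mutation.
    offspring = [mutate(crossover(*random.sample(parents, 2)))
                 for _ in range(POP_SIZE // 2)]
    # 4. Replace the least-fit individuals with the new offspring.
    population[POP_SIZE // 2 :] = offspring

population.sort(key=fitness, reverse=True)
print("best fitness:", fitness(population[0]))
```

Truncation selection and single-point crossover are used here for brevity; real GAs commonly use tournament or roulette-wheel selection and a variety of crossover operators.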

Types

Similar techniques differ in genetic representation and other implementation details, as well as in the nature of the particular problem to which they are applied.

  • Genetic algorithm – This is the most popular type of EA. One seeks the solution of a problem in the form of strings of numbers (traditionally binary, although the best representations are usually those that reflect something about the problem being solved), by applying operators such as recombination and mutation (sometimes one, sometimes both). This type of EA is often used in optimization problems.
  • Genetic programming – Here the solutions are in the form of computer programs, and their fitness is determined by their ability to solve a computational problem.
  • Evolutionary programming – Similar to genetic programming, but the structure of the program is fixed and its numerical parameters are allowed to evolve.
  • Gene expression programming – Like genetic programming, GEP also evolves computer programs but it explores a genotype-phenotype system, where computer programs of different sizes are encoded in linear chromosomes of fixed length.
  • Evolution strategy – Works with vectors of real numbers as representations of solutions, and typically uses self-adaptive mutation rates.
  • Differential evolution – Based on vector differences and therefore primarily suited for numerical optimization problems (see the sketch after this list).
  • Neuroevolution – Similar to genetic programming but the genomes represent artificial neural networks by describing structure and connection weights. The genome encoding can be direct or indirect.
  • Learning classifier system – Here the solution is a set of classifiers (rules or conditions). A Michigan-LCS evolves at the level of individual classifiers, whereas a Pittsburgh-LCS uses populations of classifier sets. Initially, classifiers were only binary, but now include real-valued, neural-net, or S-expression types. Fitness is typically determined with either a strength- or accuracy-based reinforcement learning or supervised learning approach.
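
As a concrete example of the vector-difference idea, here is a minimal Python sketch of the classic DE/rand/1/bin scheme; the sphere objective and all parameter values (population size, F, CR) are illustrative assumptions:

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9, generations=200):
    """Minimize f over the box 'bounds' using DE/rand/1/bin."""
    dim = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            a, b, c = random.sample([x for j, x in enumerate(pop) if j != i], 3)
            j_rand = random.randrange(dim)  # guarantee at least one mutated gene
            trial = []
            for j in range(dim):
                if random.random() < CR or j == j_rand:
                    v = a[j] + F * (b[j] - c[j])     # vector-difference mutation
                    lo, hi = bounds[j]
                    trial.append(min(max(v, lo), hi))  # clip to the search box
                else:
                    trial.append(pop[i][j])
            ft = f(trial)
            if ft <= fit[i]:                         # greedy one-to-one selection
                pop[i], fit[i] = trial, ft
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

# Example: minimize the 2-D sphere function.
best_x, best_f = differential_evolution(lambda x: sum(v * v for v in x),
                                        bounds=[(-5, 5), (-5, 5)])
print(best_x, best_f)
```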

Comparison to biological processes

A possible limitation of many evolutionary algorithms is their lack of a clear genotype-phenotype distinction. In nature, the fertilized egg cell undergoes a complex process known as embryogenesis to become a mature phenotype. This indirect encoding is believed to make the genetic search more robust (i.e., to reduce the probability of fatal mutations), and it may also improve the evolvability of the organism. Such indirect (also known as generative or developmental) encodings also enable evolution to exploit the regularity in the environment. Recent work in the field of artificial embryogeny, or artificial developmental systems, seeks to address these concerns. Gene expression programming successfully explores a genotype-phenotype system, where the genotype consists of linear multigenic chromosomes of fixed length and the phenotype consists of multiple expression trees or computer programs of different sizes and shapes.
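
To make the genotype-phenotype distinction concrete, here is a minimal Python sketch of an indirect (generative) encoding; the rewrite-rule genotype and string phenotype are illustrative assumptions, loosely analogous to an L-system rather than to any particular developmental encoding from the literature:

```python
def develop(genotype, axiom="A", steps=5):
    """Expand an axiom by repeatedly applying the genotype's rewrite rules."""
    phenotype = axiom
    for _ in range(steps):
        phenotype = "".join(genotype.get(symbol, symbol) for symbol in phenotype)
    return phenotype

# Genotype: two rules (a few bytes). Phenotype: an exponentially longer string.
genotype = {"A": "AB", "B": "A"}
print(develop(genotype))  # 'ABAABABAABAAB' — a Fibonacci word after 5 steps
```

A single mutation in the compact genotype changes the developed phenotype everywhere the affected rule is applied, which is precisely the regularity-exploiting property such encodings aim for.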

Related techniques

Swarm algorithms include:

  • Ant colony optimization – Based on the pheromone-trail foraging behaviour of ants, and primarily suited for combinatorial optimization problems.
  • Particle swarm optimization – Based on the ideas of animal flocking behaviour, and primarily suited for numerical optimization problems.

Other population-based metaheuristic methods

  • Hunting Search – A method inspired by the group hunting of animals such as wolves, which organize their positions to surround the prey, each relative to the positions of the others and especially that of their leader. It is a continuous optimization method, adapted as a combinatorial optimization method.
  • Adaptive dimensional search – Unlike nature-inspired metaheuristic techniques, an adaptive dimensional search algorithm does not implement any metaphor as an underlying principle. Rather it uses a simple performance-oriented method, based on the update of the search dimensionality ratio (SDR) parameter at each iteration.
  • Firefly algorithm – Inspired by the behavior of fireflies, which attract each other by flashing light. Especially useful for multimodal optimization.
  • Harmony search – Based on the ideas of musicians' behavior in searching for better harmonies. This algorithm is suitable for combinatorial optimization as well as parameter optimization.
  • Gaussian adaptation – Based on information theory. Used for maximization of manufacturing yield, mean fitness or average information. See for instance Entropy in thermodynamics and information theory.
  • Memetic algorithm – A hybrid method, inspired by Richard Dawkins's notion of a meme. It commonly takes the form of a population-based algorithm coupled with individual learning procedures capable of performing local refinements. It emphasizes the exploitation of problem-specific knowledge and tries to orchestrate local and global search in a synergistic way (a minimal sketch follows this list).
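
The following minimal Python sketch illustrates the memetic pattern; the quadratic objective, the averaging crossover, and the hill-climbing refinement are all illustrative assumptions rather than a canonical design:

```python
import random

def local_refine(x, f, step=0.1, tries=20):
    """Individual learning: greedy hill climbing around x."""
    best, best_f = x, f(x)
    for _ in range(tries):
        cand = [v + random.gauss(0, step) for v in best]
        if f(cand) < best_f:
            best, best_f = cand, f(cand)
    return best

def memetic(f, dim=2, pop_size=20, generations=50):
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f)                      # global search: rank by objective
        parents = pop[: pop_size // 2]
        children = []
        for _ in range(pop_size // 2):
            a, b = random.sample(parents, 2)
            child = [(u + v) / 2 + random.gauss(0, 0.1) for u, v in zip(a, b)]
            children.append(local_refine(child, f))  # the "memetic" local step
        pop[pop_size // 2 :] = children
    return min(pop, key=f)

# Example: minimize the 2-D sphere function.
print(memetic(lambda x: sum(v * v for v in x)))
```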

Examples

In 2020, Google stated that their AutoML-Zero can successfully rediscover classic algorithms such as the concept of neural networks.

The computer simulations Tierra and Avida attempt to model macroevolutionary dynamics.
