Normally, you’d want search commands to disregard certain minor differences between the search string you type and the text being searched. For example, sequences of whitespace characters of different length are usually perceived as equivalent; letter-case differences usually don’t matter; etc. This is known as character equivalence.
This section describes the Emacs lax search features, and how to tailor them to your needs.
By default, search commands perform lax space matching:
each space, or sequence of spaces, matches any sequence of one or more
whitespace characters in the text. (Incremental regexp search has a
separate default; see Regexp Search.) Hence, ‘foo bar’
matches ‘foo bar’, ‘foo  bar’, ‘foo   bar’, and
so on (but not ‘foobar’). More precisely, Emacs matches each
sequence of space characters in the search string to a regular
expression specified by the variable search-whitespace-regexp.
For example, to make spaces match sequences of newlines as well as
spaces, set it to ‘"[[:space:]\n]+"’. The default value of this
variable depends on the buffer’s major mode; most major modes classify
spaces, tabs, and formfeed characters as whitespace.
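As a minimal sketch, the setting suggested above could go in an init file like this. It assumes you want the change to apply globally rather than only in some buffers or modes.

```elisp
;; Sketch: let each run of spaces in a search string match any run of
;; whitespace, including newlines.  Inside a Lisp string, "\n" is a
;; literal newline character, which is valid inside a bracket expression.
(setq search-whitespace-regexp "[[:space:]\n]+")
```

With this value, a search for ‘foo bar’ should also find ‘foo’ and ‘bar’ when a line break separates them.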
If you want whitespace characters to match exactly, you can turn lax
space matching off by typing M-s SPC (isearch-toggle-lax-whitespace)
within an incremental search. Another M-s SPC turns lax space
matching back on. To disable lax whitespace matching for all
searches, change search-whitespace-regexp to nil; then each space
in the search string matches exactly one space.
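A minimal init-file sketch of the second approach, assuming you want exact space matching in every search by default:

```elisp
;; Sketch: disable lax whitespace matching for all searches.
;; Each space in the search string then matches exactly one space.
(setq search-whitespace-regexp nil)
```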
Searches in Emacs by default ignore the case of the text they are searching through, if you specify the search string in lower case. Thus, if you specify searching for ‘foo’, then ‘Foo’ and ‘foo’ also match. Regexps, and in particular character sets, behave likewise: ‘[ab]’ matches ‘a’ or ‘A’ or ‘b’ or ‘B’. This feature is known as case folding, and it is supported in both incremental and non-incremental search modes.
An upper-case letter anywhere in the search string makes the search
case-sensitive. Thus, searching for ‘Foo’ does not find
‘foo’ or ‘FOO’. This applies to regular expression search
as well as to literal string search. The effect ceases if you delete
the upper-case letter from the search string. The variable
search-upper-case controls this: if it is non-nil (the
default), an upper-case character in the search string makes the search
case-sensitive; setting it to nil disables this effect of
upper-case characters.
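For illustration, a sketch of turning this behavior off in an init file, assuming you want capital letters in a search string to keep matching case-insensitively:

```elisp
;; Sketch: an upper-case letter in the search string no longer makes
;; the search case-sensitive; case-fold-search alone then decides.
(setq search-upper-case nil)
```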
If you set the variable case-fold-search to nil, then
all letters must match exactly, including case. This is a per-buffer
variable; altering the variable normally affects only the current buffer,
unless you change its default value. See Locals.
This variable applies to nonincremental searches also, including those
performed by the replace commands (see Replace) and the minibuffer
history matching commands (see Minibuffer History).
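Because case-fold-search is a per-buffer variable, there are two common ways to set it. The sketch below shows both. The choice of text-mode-hook is only an illustrative assumption, and you would normally pick one of the two forms, not both.

```elisp
;; Sketch 1: change the default value, so that buffers without their
;; own local value search case-sensitively.
(setq-default case-fold-search nil)

;; Sketch 2: make searches case-sensitive only in buffers of one mode.
;; text-mode-hook is just an example; substitute the hook you need.
(add-hook 'text-mode-hook
          (lambda ()
            (setq-local case-fold-search nil)))
```

A plain setq evaluated interactively affects only the current buffer, which is why the first form uses setq-default.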
Typing M-c or M-s c (isearch-toggle-case-fold)
within an incremental search toggles the case sensitivity of that
search. The effect does not extend beyond the current incremental
search, but it does override the effect of adding or removing an
upper-case letter in the current search.
Several related variables control case-sensitivity of searching and
matching for specific commands or activities. For instance,
tags-case-fold-search controls case sensitivity for find-tag.
To find these variables, do M-x apropos-variable RET case-fold-search RET.
Case folding disregards case distinctions among characters, making
upper-case characters match lower-case variants, and vice versa. A
generalization of case folding is character folding, which
disregards wider classes of distinctions among similar characters.
For instance, under character folding the letter a matches all
of its accented cousins like ä and á, i.e., the
match disregards the diacritics that distinguish these
variants. In addition, a matches other characters that
resemble it, or have it as part of their graphical representation,
such as U+249C PARENTHESIZED LATIN SMALL LETTER A and U+2100
ACCOUNT OF (which looks like a small a over c).
Similarly, the ASCII double-quote character " matches
all the other variants of double quotes defined by the Unicode
standard. Finally, character folding can make a sequence of one or
more characters match another sequence of a different length: for
example, the sequence of two characters ff
matches U+FB00
LATIN SMALL LIGATURE FF. Character sequences that are not identical,
but match under character folding are known as equivalent
character sequences.
Generally, search commands in Emacs do not by default perform
character folding in order to match equivalent character sequences.
You can enable this behavior by customizing the variable
search-default-mode to char-fold-to-regexp.
See Search Customizations. Within an incremental search, typing
M-s ' (isearch-toggle-char-fold) toggles character
folding, but only for that search. (Replace commands have a different
default, controlled by a separate option; see Replacement and Lax Matches.)
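As a minimal sketch, the customization described above could be written in an init file as follows, assuming you prefer Lisp to the Customize interface:

```elisp
;; Sketch: make search commands use character folding by default,
;; so that a plain "a" in the search string also matches variants
;; such as "ä" and "á".
(setq search-default-mode #'char-fold-to-regexp)
```

You can still switch folding off for a single incremental search with M-s '.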
Like with case folding, typing an explicit variant of a character,
such as ä, as part of the search string disables character
folding for that search. If you delete such a character from the
search string, this effect ceases.