Altova XMLSpy 2024 Professional Edition

Icons and shortcuts

 

Command

Icon

Shortcut

Find

icFind

Ctrl+F

Find Next

icFindNext

F3

 

Find

The Find command displays the Find/Replace dialog (see screenshot below), in which you can specify the string you want to find and other options for the search. To find text, enter the text in the Find field or use the combo box to select from one of the last 10 search criteria, and then specify the options for the search.

 

The Find command can also be used to find file and folder names when a project is selected in the Project window.

 

Find Next

The Find Next command repeats the last Find command. It searches for the next occurrence of the input text.

 

The Find Next command can also be used to find file and folder names when a project is selected in the Project window.

 

Find/Replace dialog

The Find/Replace dialog described below appears in Text View and Grid View. The Find options can be specified via buttons located below the search term field (see screenshot below). When an option is toggled on, its button color changes to blue (see the first (casing) option in the screenshot below).

TextViewFindDialog

You can select from the following options:

 

Match case: Case-sensitive search when toggled on (Address is not the same as address).

Match whole word: Only the exact words in the text will be matched. For example, for the input string fit, with Match whole word toggled on, only the word fit will match the search string; the fit in fitness, for example, will not be matched.

Regular expression: If toggled on, the search term will be read as a regular expression. See Regular expressions below for a description of how regular expressions are used.

Filter results: Select one or more document components where the search is to be carried out.

Find anchor: Found items are indexed in document order and the index of the currently selected item is given in the Find dialog. For example, from the information in the screenshot above, we can tell that the second found item from four is currently selected. Clicking Find Next (highlighted at bottom right in the screenshot) takes you to the next found item in index order. However, if the Find Anchor option is selected, Find Next takes you to the next found item relative to the current cursor position. So, if the currently selected item is the first (say, 1 of 4) and you were to place the cursor after item 3, then Find Next would take you to item 4—and not to item 2 (as would have happened if Find Anchor was toggled off).

Find in selection: When toggled on, locks the current text selection and restricts the search to the selection. Otherwise, the entire document is searched. Before selecting a new range of text, unlock the current selection by toggling off the Find in Selection option.

 

Regular expressions

You can use regular expressions (regex) to find a text string. To do this, first, switch the Regular expression option on (see Find options above). This specifies that the text in the search term field is to be evaluated as a regular expression. Next, enter the regular expression in the search term field. For help with building a regular expression, click the Regular Expression Builder button, which is located to the right of the search term field (see screenshot below). Click an item in the Builder to enter the corresponding regex metacharacter/s in the search term field. The screenshot below shows a simple regular expression to find email addresses. For a brief description of metacharacters, see the section Regular expression metacharacters below.

Click to expand/collapse

Regular expression metacharacters

Given below is a list of regular expression metacharacters.

 

.

Matches any character. This is a placeholder for a single character.

(

Marks the start of a tagged expression.

)

Marks the end of a tagged expression.

(abc)

The ( and ) metacharacters mark the start and end of a tagged expression. Tagged expressions may be useful when you need to tag ("remember") a matched region for the purpose of referring to it later (back-reference). Up to nine expressions can be tagged (and then back-referenced later, either in the Find or Replace field).

 

For example, (the) \1 matches the string the the. This expression can be literally explained as follows: match the string "the" (and remember it as a tagged region), followed by a space character, followed by a back-reference to the tagged region matched previously.

\n

Where n is a variable that can take integer values from 1 through 9. The expression refers to the first through ninth tagged region when replacing. For example, if the find string is Fred([1-9])XXX and the replace string is Sam\1YYY, this means that in the find string there is one tagged expression that is (implicitly) indexed with the number 1; in the replace string, the tagged expression is referenced with \1. If the find-replace command is applied to Fred2XXX, it would generate Sam2YYY.

\<

Matches the start of a word.

\>

Matches the end of a word.

\x

Allows you to use a character x, which would otherwise have a special meaning. For example, \[ would be interpreted as [ and not as the start of a character set.

[...]

Indicates a set of characters. For example, [abc] means any of the characters a, b or c. You can also use ranges: for example [a-z] for any lower case character.

[^...]

The complement of the characters in the set. For example, [^A-Za-z] means any character except an alphabetic character.

^

Matches the start of a line (unless used inside a set, see above).

$

Matches the end of a line. Example: A+$ to find one or more A's at end of line.

*

Matches 0 or more times. For example, Sa*m matches Sm, Sam, Saam, Saaam and so on.

+

Matches 1 or more times. For example, Sa+m matches Sam, Saam, Saaam and so on.

 

 

Representation of special characters

Note the following expressions.

 

\r

Carriage Return (CR). You can use either CR (\r) or LF (\n) to find or create a new line

\n

Line Feed (LF). You can use either CR (\r) or LF (\n) to find or create a new line

\t

Tab character

\\

Use this to escape characters that appear in regex expression, for example: \\\n

 

Regular expression examples

This example illustrates how to find and replace text using regular expressions. In many cases, finding and replacing text is straightforward and does not require regular expressions at all. However, there may be instances where you need to manipulate text in a way that cannot be done with a standard find and replace operation. Consider, for example, that you have an XML file of several thousand lines where you need to rename certain elements in one operation, without affecting the content enclosed within them. Another example: you need to change the order of multiple attributes of an element. This is where regular expressions can help you, by eliminating a lot of work which would otherwise need to be done manually.

 

Example 1: Renaming elements

The sample XML code listing below contains a list of books. Let's suppose your goal is to replace the <Category> element of each book to <Genre>. One of the ways to achieve this goal is by using regular expressions.

 

<?xml version="1.0" encoding="UTF-8"?>
<books xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="books.xsd">
  <book id="1">
    <author>Mark Twain</author>
    <title>The Adventures of Tom Sawyer</title>
    <category>Fiction</category>
    <year>1876</year>
  </book>
  <book id="2">
    <author>Franz Kafka</author>
    <title>The Metamorphosis</title>
    <category>Fiction</category>
    <year>1912</year>
  </book>
  <book id="3">
    <author>Herman Melville</author>
    <title>Moby Dick</title>
    <category>Fiction</category>
    <year>1851</year>
  </book>
</books>

 

To solve the requirement, follow the steps below:

 

1.Press Ctrl+H to open the Find and Replace dialog box.

2.Click Use regular expressions _ic_find_regex.

3.In the Find field, enter the following text: <category>(.+)</category> . This regular expression matches all category elements, and they become highlighted.

inc-RegexExample01

To match the inner text of each element (which is not known in advance), we used the tagged expression (.+) . The tagged expression (.+) means "match one or more occurrences of any character, that is .+ , and remember this match". As shown in the next step, we will need the reference to the tagged expression later.

 

4.In the Replace field, enter the following text: <genre>\1</genre> . This regular expression defines the replacement text. Notice it uses a back-reference \1 to the previously tagged expression from the Find field. In other words, \1 in this context means "the inner text of the currently matched <category> element".

5.Click Replace All _ic_regex_replaceall and observe the results. All category elements have now been renamed to genre, which was the intended goal.

 

Example 2: Changing the order of attributes

The sample XML code listing below contains a list of products. Each product element has two attributes: id and a size. Let's suppose your goal is to change the order of id and size attributes in each product element (in other words, the size attribute should come before id). One of the ways to solve this requirement is by using regular expressions.

 

<?xml version="1.0" encoding="UTF-8"?>
<products xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="products.xsd">
  <product id="1" size="10"/>
  <product id="2" size="20"/>
  <product id="3" size="30"/>
  <product id="4" size="40"/>
  <product id="5" size="50"/>
  <product id="6" size="60"/>
</products>

 

To solve the requirement, follow the steps below:

 

1.Press Ctrl+H to open the Find and Replace dialog box.

2.Click Use regular expressions _ic_find_regex.

3.In the Find field, enter the following: <product id="(.+)" size="(.+)"/> . This regular expression matches a product element in the XML document. Notice that, in order to match the value of each attribute (which is not known in advance), a tagged expression (.+) is used twice. The tagged expression (.+) matches the value of each attribute (assumed to be one or more occurrences of any character, that is .+ ).

4.In the Replace field, enter the following: <product size="\2" id="\1"/> . This regular expression contains the replacement text for each matched product element. Notice that it uses two references \1 and \2 . These correspond to the tagged expressions from the Find field. In other words, \1 means "the value of attribute id" and \2 means "the value of attribute size".

inc-RegexExample02

6.Click Replace All _ic_regex_replaceall and observe the results. All product elements have now been updated so that attribute size comes before attribute id.

 

 

© 2018-2024 Altova GmbH