Nota:
El acceso a esta página requiere autorización. Puede intentar iniciar sesión o cambiar directorios.
El acceso a esta página requiere autorización. Puede intentar cambiar los directorios.
En este tema se amplían las consultas vistas anteriormente para conseguir algo muy útil: encontrar todas las apariciones de una cierta cadena en el documento.
Ejemplo
Este ejemplo procesa un documento WordprocessingML, con el objetivo de encontrar todas las apariciones de una porción de texto en particular dentro del documento.Para ello, utilizaremos una consulta que encuentre la cadena "Hello".Este ejemplo se basa en los ejemplos anteriormente vistos en este tutorial.En los comentarios del siguiente código se menciona dónde se encuentra la nueva consulta.
Si desea obtener instrucciones acerca de cómo crear el documento de origen utilizado en este ejemplo, vea Crear el documento XML abierto de Office de origen.
Este ejemplo utiliza las clases que se encuentran en el ensamblado WindowsBase.Utiliza los tipos del espacio de nombres System.IO.Packaging.
public static class LocalExtensions
{
public static string StringConcatenate(this IEnumerable<string> source)
{
StringBuilder sb = new StringBuilder();
foreach (string s in source)
sb.Append(s);
return sb.ToString();
}
public static string StringConcatenate<T>(this IEnumerable<T> source,
Func<T, string> func)
{
StringBuilder sb = new StringBuilder();
foreach (T item in source)
sb.Append(func(item));
return sb.ToString();
}
public static string StringConcatenate(this IEnumerable<string> source, string separator)
{
StringBuilder sb = new StringBuilder();
foreach (string s in source)
sb.Append(s).Append(separator);
return sb.ToString();
}
public static string StringConcatenate<T>(this IEnumerable<T> source,
Func<T, string> func, string separator)
{
StringBuilder sb = new StringBuilder();
foreach (T item in source)
sb.Append(func(item)).Append(separator);
return sb.ToString();
}
}
class Program
{
public static string ParagraphText(XElement e)
{
XNamespace w = e.Name.Namespace;
return e
.Elements(w + "r")
.Elements(w + "t")
.StringConcatenate(element => (string)element);
}
static void Main(string[] args)
{
const string fileName = "SampleDoc.docx";
const string documentRelationshipType =
"https://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
const string stylesRelationshipType =
"https://schemas.openxmlformats.org/officeDocument/2006/relationships/styles";
const string wordmlNamespace =
"https://schemas.openxmlformats.org/wordprocessingml/2006/main";
XNamespace w = wordmlNamespace;
XDocument xDoc = null;
XDocument styleDoc = null;
using (Package wdPackage = Package.Open(fileName, FileMode.Open, FileAccess.Read))
{
PackageRelationship docPackageRelationship =
wdPackage.GetRelationshipsByType(documentRelationshipType).FirstOrDefault();
if (docPackageRelationship != null)
{
Uri documentUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative),
docPackageRelationship.TargetUri);
PackagePart documentPart = wdPackage.GetPart(documentUri);
// Load the document XML in the part into an XDocument instance.
xDoc = XDocument.Load(XmlReader.Create(documentPart.GetStream()));
// Find the styles part. There will only be one.
PackageRelationship styleRelation =
documentPart.GetRelationshipsByType(stylesRelationshipType).FirstOrDefault();
if (styleRelation != null)
{
Uri styleUri =
PackUriHelper.ResolvePartUri(documentUri, styleRelation.TargetUri);
PackagePart stylePart = wdPackage.GetPart(styleUri);
// Load the style XML in the part into an XDocument instance.
styleDoc = XDocument.Load(XmlReader.Create(stylePart.GetStream()));
}
}
}
string defaultStyle =
(string)(
from style in styleDoc.Root.Elements(w + "style")
where (string)style.Attribute(w + "type") == "paragraph" &&
(string)style.Attribute(w + "default") == "1"
select style
).First().Attribute(w + "styleId");
// Find all paragraphs in the document.
var paragraphs =
from para in xDoc
.Root
.Element(w + "body")
.Descendants(w + "p")
let styleNode = para
.Elements(w + "pPr")
.Elements(w + "pStyle")
.FirstOrDefault()
select new
{
ParagraphNode = para,
StyleName = styleNode != null ?
(string)styleNode.Attribute(w + "val") :
defaultStyle
};
// Retrieve the text of each paragraph.
var paraWithText =
from para in paragraphs
select new
{
ParagraphNode = para.ParagraphNode,
StyleName = para.StyleName,
Text = ParagraphText(para.ParagraphNode)
};
// Following is the new query that retrieves all paragraphs
// that have specific text in them.
var helloParagraphs =
from para in paraWithText
where para.Text.Contains("Hello")
select new
{
ParagraphNode = para.ParagraphNode,
StyleName = para.StyleName,
Text = para.Text
};
foreach (var p in helloParagraphs)
Console.WriteLine("StyleName:{0} >{1}<", p.StyleName, p.Text);
}
}
Imports <xmlns:w="https://schemas.openxmlformats.org/wordprocessingml/2006/main">
Module Module1
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(ByVal source As IEnumerable(Of String)) As String
Dim sb As StringBuilder = New StringBuilder()
For Each s As String In source
sb.Append(s)
Next
Return sb.ToString()
End Function
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(Of T)(ByVal source As IEnumerable(Of T), _
ByVal func As Func(Of T, String)) As String
Dim sb As StringBuilder = New StringBuilder()
For Each item As T In source
sb.Append(func(item))
Next
Return sb.ToString()
End Function
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(Of T)(ByVal source As IEnumerable(Of T), _
ByVal separator As String) As String
Dim sb As StringBuilder = New StringBuilder()
For Each s As T In source
sb.Append(s).Append(separator)
Next
Return sb.ToString()
End Function
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(Of T)(ByVal source As IEnumerable(Of T), _
ByVal func As Func(Of T, String), ByVal separator As String) As String
Dim sb As StringBuilder = New StringBuilder()
For Each item As T In source
sb.Append(func(item)).Append(separator)
Next
Return sb.ToString()
End Function
Public Function ParagraphText(ByVal e As XElement) As String
Dim w As XNamespace = e.Name.Namespace
Return (e.<w:r>.<w:t>).StringConcatenate(Function(element) CStr(element))
End Function
' Following function is required because VB does not support short circuit evaluation
Private Function GetStyleOfParagraph(ByVal styleNode As XElement, ByVal defaultStyle As String) As String
If (styleNode Is Nothing) Then
Return defaultStyle
Else
Return styleNode.@w:val
End If
End Function
Sub Main()
Dim fileName = "SampleDoc.docx"
Dim documentRelationshipType = "https://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
Dim stylesRelationshipType = "https://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"
Dim wordmlNamespace = "https://schemas.openxmlformats.org/wordprocessingml/2006/main"
Dim xDoc As XDocument = Nothing
Dim styleDoc As XDocument = Nothing
Using wdPackage As Package = Package.Open(fileName, FileMode.Open, FileAccess.Read)
Dim docPackageRelationship As PackageRelationship = wdPackage.GetRelationshipsByType(documentRelationshipType).FirstOrDefault()
If (docPackageRelationship IsNot Nothing) Then
Dim documentUri As Uri = PackUriHelper.ResolvePartUri(New Uri("/", UriKind.Relative), docPackageRelationship.TargetUri)
Dim documentPart As PackagePart = wdPackage.GetPart(documentUri)
' Load the document XML in the part into an XDocument instance.
xDoc = XDocument.Load(XmlReader.Create(documentPart.GetStream()))
' Find the styles part. There will only be one.
Dim styleRelation As PackageRelationship = documentPart.GetRelationshipsByType(stylesRelationshipType).FirstOrDefault()
If (styleRelation IsNot Nothing) Then
Dim styleUri As Uri = PackUriHelper.ResolvePartUri(documentUri, styleRelation.TargetUri)
Dim stylePart As PackagePart = wdPackage.GetPart(styleUri)
' Load the style XML in the part into an XDocument instance.
styleDoc = XDocument.Load(XmlReader.Create(stylePart.GetStream()))
End If
End If
End Using
Dim defaultStyle As String = _
( _
From style In styleDoc.Root.<w:style> _
Where style.@w:type = "paragraph" And _
style.@w:default = "1" _
Select style _
).First().@w:styleId
' Find all paragraphs in the document.
Dim paragraphs = _
From para In xDoc.Root.<w:body>...<w:p> _
Let styleNode As XElement = para.<w:pPr>.<w:pStyle>.FirstOrDefault _
Select New With { _
.ParagraphNode = para, _
.StyleName = GetStyleOfParagraph(styleNode, defaultStyle) _
}
' Retrieve the text of each paragraph.
Dim paraWithText = _
From para In paragraphs _
Select New With { _
.ParagraphNode = para.ParagraphNode, _
.StyleName = para.StyleName, _
.Text = ParagraphText(para.ParagraphNode) _
}
' Following is the new query that retrieves all paragraphs
' that have specific text in them.
Dim helloParagraphs = _
From para In paraWithText _
Where para.Text.Contains("Hello") _
Select New With _
{ _
.ParagraphNode = para.ParagraphNode, _
.StyleName = para.StyleName, _
.Text = para.Text _
}
For Each p In helloParagraphs
Console.WriteLine("StyleName:{0} >{1}<", p.StyleName, p.Text)
Next
End Sub
End Module
Este ejemplo genera el siguiente resultado:
StyleName:Code > Console.WriteLine("Hello World");<
StyleName:Code >Hello World<
Por supuesto, puede modificar la búsqueda de forma que encuentre aquellas líneas con un estilo en particular.La siguiente consulta encuentra todas las líneas en blanco que tienen el estilo Code:
public static class LocalExtensions
{
public static string StringConcatenate(this IEnumerable<string> source)
{
StringBuilder sb = new StringBuilder();
foreach (string s in source)
sb.Append(s);
return sb.ToString();
}
public static string StringConcatenate<T>(this IEnumerable<T> source,
Func<T, string> func)
{
StringBuilder sb = new StringBuilder();
foreach (T item in source)
sb.Append(func(item));
return sb.ToString();
}
public static string StringConcatenate(this IEnumerable<string> source, string separator)
{
StringBuilder sb = new StringBuilder();
foreach (string s in source)
sb.Append(s).Append(separator);
return sb.ToString();
}
public static string StringConcatenate<T>(this IEnumerable<T> source,
Func<T, string> func, string separator)
{
StringBuilder sb = new StringBuilder();
foreach (T item in source)
sb.Append(func(item)).Append(separator);
return sb.ToString();
}
}
class Program
{
public static string ParagraphText(XElement e)
{
XNamespace w = e.Name.Namespace;
return e
.Elements(w + "r")
.Elements(w + "t")
.StringConcatenate(element => (string)element);
}
static void Main(string[] args)
{
const string fileName = "SampleDoc.docx";
const string documentRelationshipType = "https://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
const string stylesRelationshipType = "https://schemas.openxmlformats.org/officeDocument/2006/relationships/styles";
const string wordmlNamespace = "https://schemas.openxmlformats.org/wordprocessingml/2006/main";
XNamespace w = wordmlNamespace;
XDocument xDoc = null;
XDocument styleDoc = null;
using (Package wdPackage = Package.Open(fileName, FileMode.Open, FileAccess.Read))
{
PackageRelationship docPackageRelationship = wdPackage.GetRelationshipsByType(documentRelationshipType).FirstOrDefault();
if (docPackageRelationship != null)
{
Uri documentUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), docPackageRelationship.TargetUri);
PackagePart documentPart = wdPackage.GetPart(documentUri);
// Load the document XML in the part into an XDocument instance.
xDoc = XDocument.Load(XmlReader.Create(documentPart.GetStream()));
// Find the styles part. There will only be one.
PackageRelationship styleRelation = documentPart.GetRelationshipsByType(stylesRelationshipType).FirstOrDefault();
if (styleRelation != null)
{
Uri styleUri = PackUriHelper.ResolvePartUri(documentUri, styleRelation.TargetUri);
PackagePart stylePart = wdPackage.GetPart(styleUri);
// Load the style XML in the part into an XDocument instance.
styleDoc = XDocument.Load(XmlReader.Create(stylePart.GetStream()));
}
}
}
string defaultStyle =
(string)(
from style in styleDoc.Root.Elements(w + "style")
where (string)style.Attribute(w + "type") == "paragraph" &&
(string)style.Attribute(w + "default") == "1"
select style
).First().Attribute(w + "styleId");
// Find all paragraphs in the document.
var paragraphs =
from para in xDoc
.Root
.Element(w + "body")
.Descendants(w + "p")
let styleNode = para
.Elements(w + "pPr")
.Elements(w + "pStyle")
.FirstOrDefault()
select new
{
ParagraphNode = para,
StyleName = styleNode != null ?
(string)styleNode.Attribute(w + "val") :
defaultStyle
};
// Retrieve the text of each paragraph.
var paraWithText =
from para in paragraphs
select new
{
ParagraphNode = para.ParagraphNode,
StyleName = para.StyleName,
Text = ParagraphText(para.ParagraphNode)
};
// Retrieve all paragraphs that have no text and are styled Code.
var blankCodeParagraphs =
from para in paraWithText
where para.Text == "" && para.StyleName == "Code"
select new
{
ParagraphNode = para.ParagraphNode,
StyleName = para.StyleName,
Text = para.Text
};
foreach (var p in blankCodeParagraphs)
Console.WriteLine("StyleName:{0} >{1}<", p.StyleName, p.Text);
}
}
Imports System.IO.Packaging
Imports <xmlns:w="https://schemas.openxmlformats.org/wordprocessingml/2006/main">
Module Module1
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(ByVal source As IEnumerable(Of String)) As String
Dim sb As StringBuilder = New StringBuilder()
For Each s As String In source
sb.Append(s)
Next
Return sb.ToString()
End Function
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(Of T)(ByVal source As IEnumerable(Of T), _
ByVal func As Func(Of T, String)) As String
Dim sb As StringBuilder = New StringBuilder()
For Each item As T In source
sb.Append(func(item))
Next
Return sb.ToString()
End Function
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(Of T)(ByVal source As IEnumerable(Of T), _
ByVal separator As String) As String
Dim sb As StringBuilder = New StringBuilder()
For Each s As T In source
sb.Append(s).Append(separator)
Next
Return sb.ToString()
End Function
<System.Runtime.CompilerServices.Extension()> _
Public Function StringConcatenate(Of T)(ByVal source As IEnumerable(Of T), _
ByVal func As Func(Of T, String), ByVal separator As String) As String
Dim sb As StringBuilder = New StringBuilder()
For Each item As T In source
sb.Append(func(item)).Append(separator)
Next
Return sb.ToString()
End Function
Public Function ParagraphText(ByVal e As XElement) As String
Dim w As XNamespace = e.Name.Namespace
Return (e.<w:r>.<w:t>).StringConcatenate(Function(element) CStr(element))
End Function
' Following function is required because VB does not support short circuit evaluation
Private Function GetStyleOfParagraph(ByVal styleNode As XElement, ByVal defaultStyle As String) As String
If (styleNode Is Nothing) Then
Return defaultStyle
Else
Return styleNode.@w:val
End If
End Function
Sub Main()
Dim fileName = "SampleDoc.docx"
Dim documentRelationshipType = "https://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
Dim stylesRelationshipType = "https://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"
Dim wordmlNamespace = "https://schemas.openxmlformats.org/wordprocessingml/2006/main"
Dim xDoc As XDocument = Nothing
Dim styleDoc As XDocument = Nothing
Using wdPackage As Package = Package.Open(fileName, FileMode.Open, FileAccess.Read)
Dim docPackageRelationship As PackageRelationship = wdPackage.GetRelationshipsByType(documentRelationshipType).FirstOrDefault()
If (docPackageRelationship IsNot Nothing) Then
Dim documentUri As Uri = PackUriHelper.ResolvePartUri(New Uri("/", UriKind.Relative), docPackageRelationship.TargetUri)
Dim documentPart As PackagePart = wdPackage.GetPart(documentUri)
' Load the document XML in the part into an XDocument instance.
xDoc = XDocument.Load(XmlReader.Create(documentPart.GetStream()))
' Find the styles part. There will only be one.
Dim styleRelation As PackageRelationship = documentPart.GetRelationshipsByType(stylesRelationshipType).FirstOrDefault()
If (styleRelation IsNot Nothing) Then
Dim styleUri As Uri = PackUriHelper.ResolvePartUri(documentUri, styleRelation.TargetUri)
Dim stylePart As PackagePart = wdPackage.GetPart(styleUri)
' Load the style XML in the part into an XDocument instance.
styleDoc = XDocument.Load(XmlReader.Create(stylePart.GetStream()))
End If
End If
End Using
Dim defaultStyle As String = _
( _
From style In styleDoc.Root.<w:style> _
Where style.@w:type = "paragraph" And _
style.@w:default = "1" _
Select style _
).First().@w:styleId
' Find all paragraphs in the document.
Dim paragraphs = _
From para In xDoc.Root.<w:body>...<w:p> _
Let styleNode As XElement = para.<w:pPr>.<w:pStyle>.FirstOrDefault _
Select New With { _
.ParagraphNode = para, _
.StyleName = GetStyleOfParagraph(styleNode, defaultStyle) _
}
' Retrieve the text of each paragraph.
Dim paraWithText = _
From para In paragraphs _
Select New With { _
.ParagraphNode = para.ParagraphNode, _
.StyleName = para.StyleName, _
.Text = ParagraphText(para.ParagraphNode) _
}
' Retrieve all paragraphs that have no text and are styled Code.
Dim blankCodeParagraphs = _
From para In paraWithText _
Where String.IsNullOrEmpty(para.Text) And para.StyleName.Equals("Code") _
Select New With _
{ _
.ParagraphNode = para.ParagraphNode, _
.StyleName = para.StyleName, _
.Text = para.Text _
}
For Each p In blankCodeParagraphs
Console.WriteLine("StyleName:{0} >{1}<", p.StyleName, p.Text)
Next
End Sub
End Module
Este ejemplo genera el siguiente resultado:
StyleName:Code ><
Por supuesto, se podría mejorar este ejemplo de diversas maneras.Por ejemplo, podríamos utilizar expresiones regulares para buscar texto, podríamos recorrer todos los archivos de Word de un directorio en particular, etc.
Observe que el rendimiento de este ejemplo es prácticamente igual que si se hubiese escrito en una única consulta.Dado que cada consulta se ha implementado utilizando el modelo diferido, éstas no proporcionarán sus resultados hasta que se itere la consulta.Para obtener más información acerca de la ejecución de evaluaciones diferidas, vea Ejecución aplazada y evaluación diferida en LINQ to XML.
Pasos siguientes
En la sección siguiente se proporciona más información acerca de los documentos WordprocessingML:
Vea también
Tasks
Refactorizar utilizando una función pura
Conceptos
Tutorial: manipular contenido en un documento WordprocessingML